Can a Video Contain Malicious Code Too?

Note

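This challenge is re1 from the "中国高校智能机器人创意大赛产教融合赛道 - 软件系统安全赛" (roughly, the Software System Security contest in the industry-education track of the China University Intelligent Robot Creative Competition). Unfortunately, I did not solve it during the contest. I guess I'm nothing without AI 😔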

1. Challenge Background

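The challenge appears to be based on this project: Malicious-PixelCode. Its GitHub commit history shows it was uploaded in November 2025, so this counts as a real-world challenge.

First, a video on a computer is still just a sequence of bytes, and any byte sequence can be read by a program.

You can watch Tsoding's video on running shaders on the CPU by manipulating pixels directly, without third-party libraries or frameworks, to see how plain code can generate simple images and videos.

Pixel Code is a technique for representing binary data as visible pixels. The bytes of an executable are converted into a structured matrix of colored pixels, each pixel encoding a small part of the original file, so the data can be stored, transmitted, or embedded in images or videos without exposing its original form.

The technique is often used to study data obfuscation, covert storage, and unconventional encoding channels. By converting malware into a pixel-based representation, an attacker may try to disguise the payload as harmless multimedia content. The encoded files can then be uploaded to legitimate platforms, bypassing traditional security filters and giving a false appearance of trustworthiness. A loader or decoder controlled by the attacker can later retrieve the video and reconstruct the original malicious file.

The overall workflow is as follows: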
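To make the idea concrete, here is a minimal sketch of the bit-to-cell mapping, assuming one bit per cell and row-major order. The function names `bytes_to_grid` and `grid_to_bytes` are illustrative and do not come from the project; a real encoder would additionally scale each cell up to a block of pixels and write the grids out as video frames.

```python
def bytes_to_grid(data: bytes, cols: int) -> list[list[int]]:
    """Lay the bits of `data` out row by row in a grid `cols` cells wide."""
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    # Pad the last row with 0 (white) cells so the grid is rectangular.
    bits += [0] * (-len(bits) % cols)
    return [bits[i:i + cols] for i in range(0, len(bits), cols)]


def grid_to_bytes(grid: list[list[int]], length: int) -> bytes:
    """Read the cells back in row-major order and repack `length` bytes."""
    bits = [bit for row in grid for bit in row]
    out = bytearray()
    for i in range(0, length * 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


grid = bytes_to_grid(b"\x7fELF", cols=8)
for row in grid:
    # '#' stands for a black cell (bit 1), '.' for a white cell (bit 0).
    print("".join("#" if bit else "." for bit in row))
assert grid_to_bytes(grid, 4) == b"\x7fELF"
```

The original length has to be known (or the padding tolerated) when decoding, because the padded cells are indistinguishable from real zero bits.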

Pixel Code Workflow

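Write the attack payload and build it into an executable.
Process the resulting executable with a Python-based tool designed to convert binary data into Pixel Code. The tool turns the executable into a viewable MP4 video, with the binary bytes embedded in the frames as pixel values.
Upload the generated Pixel Code MP4 file to a video platform as an ordinary video.
Write a Python Stager script that reconstructs the code from the video.
Convert the Stager script into an encoded sequence (here, it is first converted to an EXE and then Base64-encoded once) and embed it in the loader code.
Build the loader.

Once the target machine downloads and runs the loader, the loader automatically downloads the video from the video site. It then reconstructs the Stager script, the Stager script rebuilds the attack payload from the video, and finally the payload runs and carries out the attack.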

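2. Challenge Analysis

The challenge provides two files: a Loader in ELF format and video.mp4. Playing video.mp4 shows a series of barcode-like patterns "scrolling" by.

Loading the Loader for analysis shows that it ships with debug information, so the overall logic is fairly clear: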

Loader

int __fastcall main(int argc, const char **argv, const char **envp)
{
  __int64 v3; // rax
  __int64 v4; // rax
  __int64 v5; // rax
  int v6; // ebx
  __int64 v7; // rax
  const char *file; // rax
  _BYTE v10[9]; // [rsp+7h] [rbp-99h] BYREF
  _BYTE *v11; // [rsp+10h] [rbp-90h]
  _BYTE *v12; // [rsp+18h] [rbp-88h]
  __int64 videoFile[4]; // [rsp+20h] [rbp-80h] BYREF
  __int64 src[4]; // [rsp+40h] [rbp-60h] BYREF
  __int64 File[7]; // [rsp+60h] [rbp-40h] BYREF

  File[5] = __readfsqword(0x28u);
  *(_QWORD *)&v10[1] = v10;
  std::string::basic_string(videoFile, "video.mp4", v10);
  std::__new_allocator<char>::~__new_allocator(v10);
  if ( file_exists(videoFile) )
  {
    v11 = v10;
    std::string::basic_string(
      File,
      stager_pyc_base64,                        // "delete"
      v10);                                     // "delete"
    base64_decode(src, File);
    std::string::~string();
    std::__new_allocator<char>::~__new_allocator(v10);
    v12 = v10;
    std::string::basic_string(File, "stager.pyc", v10);
    std::__new_allocator<char>::~__new_allocator(v10);
    if ( (unsigned __int8)write_string_to_file((__int64)File, src) == 1 )
    {
      file = (const char *)std::string::c_str(File);
      chmod(file, 493u);
      run_python_script(File);
      v6 = 0;
    }
    else
    {
      v7 = std::operator<<<std::char_traits<char>>(&std::cerr, "创建 stager.pyc 失败");
      std::ostream::operator<<(v7, &std::endl<char,std::char_traits<char>>);
      v6 = 1;
    }
    std::string::~string();
    std::string::~string();
  }
  else
  {
    v3 = std::operator<<<std::char_traits<char>>(&std::cerr, "错误：未找到视频文件 ");
    v4 = std::operator<<<char>(v3, videoFile);
    std::ostream::operator<<(v4, &std::endl<char,std::char_traits<char>>);
    v5 = std::operator<<<std::char_traits<char>>(&std::cerr, "请确保 video.mp4 位于当前目录");
    std::ostream::operator<<(v5, &std::endl<char,std::char_traits<char>>);
    v6 = 1;
  }
  std::string::~string();
  return v6;
}

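The Loader's logic is as follows (the 493u passed to chmod is decimal for octal 0755, i.e. rwxr-xr-x):

graph LR
  A(Check whether video.mp4 exists) -->|yes| B(Load and decode the script byte sequence)
  A -->|no| C(Print an error)
  B --> D(Write the decoded result to stager.pyc)
  D -->|success| E(Set permissions and run the script)
  D -->|failure| F(Print an error)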

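Here the challenge author removed the Stager.pyc byte sequence, leaving only "delete". However, the .data section still contains the encoded sequence of the encoder that was used to turn the attack payload into the video. Dumping it, saving it as a .pyc file, and decompiling it gives: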
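The dump can also be scripted rather than done by hand. The sketch below scans raw bytes for long Base64-looking runs and decodes them, then uses the fact that a .pyc header's 4-byte magic ends in `\r\n` as a cheap filter. The helper names, the 64-character threshold, and the synthetic test image are assumptions for illustration; nothing here is taken from the challenge binary itself.

```python
import base64
import binascii
import re

# Hypothetical threshold: ignore short runs that merely look like Base64.
MIN_RUN = 64


def carve_base64_blobs(image: bytes, min_run: int = MIN_RUN) -> list[bytes]:
    """Decode every long Base64-looking run found in `image`."""
    pattern = re.compile(rb"[A-Za-z0-9+/]{%d,}={0,2}" % min_run)
    blobs = []
    for match in pattern.finditer(image):
        run = match.group()
        if not run.endswith(b"="):
            # Unpadded run: trim to a multiple of 4 so it can be decoded.
            run = run[: len(run) - len(run) % 4]
        try:
            blobs.append(base64.b64decode(run, validate=True))
        except binascii.Error:
            continue  # misaligned or not really Base64
    return blobs


def looks_like_pyc(blob: bytes) -> bool:
    """A .pyc file starts with a 4-byte magic whose last two bytes are CR LF."""
    return len(blob) >= 4 and blob[2:4] == b"\r\n"


# Synthetic stand-in for a binary image with an embedded Base64 string.
fake_pyc = b"\x55\x0d\r\n" + bytes(60)
image = b"\x00junk\x00" + base64.b64encode(fake_pyc) + b"\x00tail"
candidates = [blob for blob in carve_base64_blobs(image) if looks_like_pyc(blob)]
print(len(candidates))
```

In practice one would read the Loader file's bytes into `image` and write each candidate out as a .pyc file for the decompiler.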

payload.py

from PIL import Image
import math
import os
import sys
import numpy as np
import imageio
from tqdm import tqdm

def file_to_video(input_file, width, height, pixel_size, fps, output_file = (640, 480, 8, 10, 'video.mp4')):
    if not os.path.isfile(input_file):
        return None
    file_size = None.path.getsize(input_file)
    binary_string = ''
    with None:
        f = open(input_file, 'rb')
        for chunk in None(None((lambda : f.read(1024)), b''), math.ceil(file_size / 1024), 'KB', '读取文件', **('iterable', 'total', 'unit', 'desc')):
            binary_string += ''.join((lambda .0: pass)(chunk))
    xor_key = '10101010'
    xor_binary_string = ''
    for i in range(0, len(binary_string), 8):
        chunk = binary_string[i:i + 8]
        if len(chunk) == 8:
            chunk_int = int(chunk, 2)
            key_int = int(xor_key, 2)
            xor_result = chunk_int ^ key_int
            xor_binary_string += f'''{xor_result:08b}'''
            continue
        xor_binary_string += chunk
    
    binary_string = xor_binary_string
    pixels_per_image = (width // pixel_size) * (height // pixel_size)
    num_images = math.ceil(len(binary_string) / pixels_per_image)
    frames = []
    for i in tqdm(range(num_images), '生成视频帧', **('desc',)):
        start = i * pixels_per_image
        bits = binary_string[start:start + pixels_per_image]
        if len(bits) < pixels_per_image:
            bits = bits + '0' * (pixels_per_image - len(bits))
        img = Image.new('RGB', (width, height), 'white', **('color',))
        for r in range(height // pixel_size):
            row_start = r * (width // pixel_size)
            row_end = (r + 1) * (width // pixel_size)
            row = bits[row_start:row_end]
            for c, bit in enumerate(row):
                color = (0, 0, 0) if bit == '1' else (255, 255, 255)
                x1 = c * pixel_size
                y1 = r * pixel_size
                img.paste(color, (x1, y1, x1 + pixel_size, y1 + pixel_size))
            
        
        frames.append(np.array(img))
    
    with imageio.get_writer(output_file, fps, 'libx264', **('fps', 'codec')) as writer:
        for frame in tqdm(frames, '写入视频帧', **('desc',)):
            writer.append_data(frame)

if __name__ == '__main__':
    input_path = 'payload'
    if os.path.exists(input_path):
        file_to_video(input_path)
    else:
        sys.exit(1)

Comparing against the original project's script, I modified that script to match the decompiled logic:

payload_fixed.py

from PIL import Image
import math, os
import numpy as np
from moviepy import ImageSequenceClip
from tqdm import tqdm

# tqdm is a progress-bar library

def file_to_video(input_file, width=640, height=480, pixel_size=8, fps=10):
    if not os.path.isfile(input_file):
        print(f"Error: File '{input_file}' does not exist.")
        return

    file_size = os.path.getsize(input_file)
    binary_string = ""
    with open(input_file, "rb") as f:
        for chunk in tqdm(iterable=iter(lambda: f.read(1024), b""), 
                         total=math.ceil(file_size/1024), unit="KB"):
            binary_string += "".join(f"{byte:08b}" for byte in chunk)

    xor_key = '10101010'
    xor_binary_string = ''
    for i in range(0, len(binary_string), 8):
        chunk = binary_string[i:i + 8]
        if len(chunk) == 8:
            chunk_int = int(chunk, 2)
            key_int = int(xor_key, 2)
            xor_result = chunk_int ^ key_int
            xor_binary_string += f'''{xor_result:08b}'''
            continue
        xor_binary_string += chunk
    binary_string = xor_binary_string

    pixels_per_image = (width // pixel_size) * (height // pixel_size)
    num_images = math.ceil(len(binary_string) / pixels_per_image)
    frames = []

    for i in tqdm(range(num_images), desc="Generating frames"):
        start = i * pixels_per_image
        bits = binary_string[start:start + pixels_per_image]
        img = Image.new('RGB', (width, height), color='white')
        
        if len(bits) < pixels_per_image:
            bits = bits + '0' * (pixels_per_image - len(bits))

        for r in range(height // pixel_size):
            row = bits[r * (width // pixel_size):(r + 1) * (width // pixel_size)]
            for c, bit in enumerate(row):
                color = (0, 0, 0) if bit == '1' else (255, 255, 255)
                x1, y1 = c * pixel_size, r * pixel_size
                img.paste(color, (x1, y1, x1 + pixel_size, y1 + pixel_size))
        
        frames.append(np.array(img))
        # print(frames)

    clip = ImageSequenceClip(frames, fps=fps)
    clip.write_videofile('video.mp4', fps=fps, codec='libx264')
    print("Video generated successfully: video.mp4")



if __name__ == "__main__":
    print("Convert file → video")
    path = input("Enter file path: ")
    file_to_video(path)

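The script has two main parts. First, it converts the program into a bit stream and XORs every byte with 0xAA. Second, it draws each bit of that stream as an 8 × 8 pixel block (black for 1, white for 0), filling each frame row by row, and writes the frames out as a 640 × 480 MP4 video at 10 FPS.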
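The frame capacity follows directly from these defaults: 640 // 8 = 80 blocks per row and 480 // 8 = 60 rows, so 4800 bits (600 bytes) per frame. A small sketch of that arithmetic, with hypothetical helper names and a made-up 1 MB payload size for illustration:

```python
import math


def frame_capacity_bits(width=640, height=480, pixel_size=8):
    """Number of one-bit blocks that fit in a single frame."""
    return (width // pixel_size) * (height // pixel_size)


def frames_needed(payload_bytes, **frame_args):
    """Frames required to hold `payload_bytes`, last frame zero-padded."""
    return math.ceil(payload_bytes * 8 / frame_capacity_bits(**frame_args))


bits = frame_capacity_bits()
print(bits, bits // 8)               # 4800 bits, i.e. 600 bytes per frame
print(frames_needed(1_000_000))      # hypothetical 1 MB payload: 1667 frames
print(frames_needed(1_000_000) / 10) # playback time in seconds at 10 FPS
```

Because 4800 is a multiple of 8, a frame boundary never splits a byte, and the zero-padding of the last frame always covers whole bytes.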

Info

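How did I find this GitHub project? After decoding the pyc file, I noticed some valid file names in the output, such as Payload_To_PixelCode_video.py, so I searched for it on Bing and found the project right away.

Now we know how the forward conversion works, but what about the reverse?

The reference project provides a decoder: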

Stager_To_PixelCode_video.py

import imageio
import numpy as np
import os

def frames_to_bits_auto(frames):
    bits_list = []
    for frame in frames:
        gray = np.mean(frame, axis=2)
        h, w = gray.shape
        pixel_size_h = max(1, h // 256)
        pixel_size_w = max(1, w // 256)
        pixel_size = min(pixel_size_h, pixel_size_w)
        h_blocks = h // pixel_size
        w_blocks = w // pixel_size
        cropped = gray[:h_blocks * pixel_size, :w_blocks * pixel_size]
        reshaped = cropped.reshape(h_blocks, pixel_size, w_blocks, pixel_size)
        block_means = reshaped.mean(axis=(1, 3))
        block_bits = (block_means < 128).astype(np.uint8)
        bits_list.append(block_bits.ravel())
    if not bits_list:
        return np.array([], dtype=np.uint8)
    return np.concatenate(bits_list)

def bits_to_file(bits, output_file):
    remainder = bits.size % 8
    if remainder:
        bits = bits[: bits.size - remainder]
    if bits.size == 0:
        return False
    packed = np.packbits(bits, bitorder='big')
    try:
        with open(output_file, 'wb') as f:
            f.write(packed.tobytes())
        return True
    except:
        return False

def video_to_exe(video_path, output_name="Final_Result"):
    if not os.path.exists(video_path):
        return False
    try:
        reader = imageio.get_reader(video_path, 'ffmpeg')
    except:
        return False

    frames = []
    try:
        for frame in reader:
            frames.append(frame)
    except:
        reader.close()
        return False
    reader.close()

    bits = frames_to_bits_auto(frames)
    return bits_to_file(bits, output_name)

if __name__ == "__main__":
    video_file = "Pixel_Code_Video.mp4"
    out_exe = "Final_Result"
    success = video_to_exe(video_file, out_exe)
    if success and os.path.exists(out_exe):
        try:
            print("Reconstruction Success.\n")
            # os.startfile(out_exe)  
        except:
            print("Reconstruction Failed.\n")
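One detail of this decoder is worth checking before reusing it: `frames_to_bits_auto` guesses the block size from the frame dimensions instead of taking it as a parameter. The sketch below reproduces just that arithmetic (the function name `auto_pixel_size` is mine). Assuming the challenge video really is 640 × 480 with 8-pixel blocks, as the recovered encoder defaults suggest, the guess would not come out as 8.

```python
def auto_pixel_size(height: int, width: int) -> int:
    """Same arithmetic frames_to_bits_auto uses to guess the block size."""
    pixel_size_h = max(1, height // 256)
    pixel_size_w = max(1, width // 256)
    return min(pixel_size_h, pixel_size_w)


print(auto_pixel_size(480, 640))    # 1, not the 8 used by the recovered encoder
print(auto_pixel_size(1080, 1920))  # 4, for comparison with a larger frame
```

With a block size of 1, every pixel would be read as its own bit, so each encoded bit would be read 8 times per row and each row 8 times over. If that applies to the actual video, replacing the guess with a fixed `pixel_size = 8` is probably needed in addition to the XOR change.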

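Comparing with the original project's encoder, the challenge's encoder changes only two things: the bitwise XOR applied when the program bytes are converted to a bit stream, and the zero-padding applied when the bit stream runs short. The second does not matter for reversing, so we only need to determine where in the decoder the XOR should be undone.

In the encoder, the XOR is applied right after the program bytes are converted to a bit stream, so the bit stream recovered by the decoder must be XORed once more to give back the original program bytes. The inverse XOR therefore goes right after numpy packs the bits back into bytes: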

# ...

def undo_xor(data):
    # Reverse the encoder's per-byte XOR; 0xAA == 0b10101010.
    return bytes(b ^ 0xAA for b in data)

def bits_to_file(bits, output_file):
    # Drop any trailing bits that do not fill a whole byte.
    remainder = bits.size % 8
    if remainder:
        bits = bits[: bits.size - remainder]
    if bits.size == 0:
        return False
    packed = np.packbits(bits, bitorder='big')
    # Undo the XOR only after the bits are packed back into bytes.
    decoded = undo_xor(packed.tobytes())
    try:
        with open(output_file, 'wb') as f:
            f.write(decoded)
        return True
    except OSError:
        return False

# ...
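As a sanity check of the XOR placement, here is a dependency-free sketch that decodes a single grayscale frame with a fixed block size and then undoes the XOR per byte. It is written for readability, not speed, and takes the frame as nested lists of 0-255 values (a NumPy frame could be converted with `np.mean(frame, axis=2).tolist()`). The names `decode_frame` and `bits_to_bytes`, the fixed block size of 8, and the synthetic test frame are assumptions based on the recovered encoder defaults, not code from the project.

```python
def decode_frame(gray, pixel_size=8, threshold=128):
    """Turn one grayscale frame (rows of 0-255 values) into a list of bits.

    Each pixel_size x pixel_size block is averaged; dark blocks count as 1,
    matching the encoder's black-for-1 convention.
    """
    rows = len(gray) // pixel_size
    cols = len(gray[0]) // pixel_size
    bits = []
    for r in range(rows):
        for c in range(cols):
            total = 0
            for y in range(r * pixel_size, (r + 1) * pixel_size):
                total += sum(gray[y][c * pixel_size:(c + 1) * pixel_size])
            bits.append(1 if total / pixel_size ** 2 < threshold else 0)
    return bits


def bits_to_bytes(bits, xor_key=0xAA):
    """Pack bits MSB-first into bytes, then undo the encoder's XOR."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value ^ xor_key)
    return bytes(out)


# Synthetic frame holding one byte: 0x7F ^ 0xAA == 0xD5 == 0b11010101.
encoded_bits = [1, 1, 0, 1, 0, 1, 0, 1]
frame = [[0 if bit else 255 for bit in encoded_bits for _ in range(8)]
         for _ in range(8)]
print(bits_to_bytes(decode_frame(frame)))  # b'\x7f'
```

Note that the zero bits padding the last frame decode to trailing 0xAA bytes after the XOR, so the reconstructed file may be slightly longer than the original.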