WordStegano

stegobox.codec.WordStegano

Bases: BaseCodec

This codec hides a string in a Word (.docx) document.

A Word document is a package of XML files, each of which describes a different aspect of the document. The document.xml file is the main part of the package and records the text content together with related attributes. In XML-based Office documents the basic unit is the element, which can carry several attributes and their values as additional information. Hiding and extracting the payload both operate on these attributes.
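As a minimal sketch of the package layout described above, the following reads word/document.xml out of a .docx-style ZIP package and lists every element attribute it finds. The helper name `list_element_attributes` and the in-memory sample package are illustrative assumptions, not part of stegobox, and the sample is not a complete, valid Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_element_attributes(docx_file):
    """Yield (tag, attribute name, value) for each attribute in word/document.xml.

    `docx_file` may be a path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx_file) as package:
        xml_bytes = package.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    for element in root.iter():
        for name, value in element.attrib.items():
            yield element.tag, name, value


# Build a tiny stand-in package in memory (assumption: just enough structure
# to demonstrate the idea, not a file that Word would open).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p w:rsidR="00A1B2C3"><w:r><w:t>Hello</w:t></w:r></w:p>'
    "</w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", SAMPLE_XML)
buffer.seek(0)

attributes = list(list_element_attributes(buffer))
```

ElementTree reports tags and attribute names in `{namespace}local` form, so the single attribute found here is the `rsidR` attribute on the paragraph element.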

"Identification attributes" generally distinguish data or attributes such as text, tables, etc. Their values are unique and randomly generated, and they are not related to the user or to the time of modification. Changing the value of an identification attribute does not affect the content of the text, so information can be hidden in an Office XML document by writing it into these attribute values.
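The idea can be sketched on a plain XML string. This is not the stegobox implementation: it assumes the identification attribute is `w:rsidR` with an 8-hex-digit value, and the names `hide`, `reveal`, and `CARRIER_XML` are hypothetical. Each attribute value carries four payload bytes, and the last value is zero-padded, so this sketch needs the payload length to strip the padding when extracting.

```python
import re
from typing import Tuple

# Assumption: identification attributes look like w:rsidR="XXXXXXXX" (8 hex digits).
RSID_PATTERN = re.compile(r'(w:rsidR=")([0-9A-Fa-f]{8})(")')


def hide(xml_text: str, payload: str) -> Tuple[str, int]:
    """Overwrite rsidR values with the payload's hex digits; return (xml, byte length)."""
    raw = payload.encode("utf-8")
    hex_data = raw.hex().upper()
    # Split into 8-digit chunks, zero-padding the last one.
    chunks = [hex_data[i:i + 8].ljust(8, "0") for i in range(0, len(hex_data), 8)]
    slots = RSID_PATTERN.findall(xml_text)
    if len(chunks) > len(slots):
        raise ValueError("carrier has too few rsidR attributes for this payload")
    remaining = iter(chunks)

    def substitute(match):
        chunk = next(remaining, None)
        if chunk is None:
            return match.group(0)  # payload exhausted: leave the value untouched
        return match.group(1) + chunk + match.group(3)

    return RSID_PATTERN.sub(substitute, xml_text), len(raw)


def reveal(xml_text: str, byte_length: int) -> str:
    """Read `byte_length` payload bytes back out of the rsidR values."""
    values = [match.group(2) for match in RSID_PATTERN.finditer(xml_text)]
    hex_data = "".join(values)[: byte_length * 2]
    return bytes.fromhex(hex_data).decode("utf-8")


CARRIER_XML = (
    '<w:p w:rsidR="00A1B2C3"><w:r><w:t>First</w:t></w:r></w:p>'
    '<w:p w:rsidR="00B2C3D4"><w:r><w:t>Second</w:t></w:r></w:p>'
    '<w:p w:rsidR="00D4E5F6"><w:r><w:t>Third</w:t></w:r></w:p>'
)
stego_xml, length = hide(CARRIER_XML, "secret")
recovered = reveal(stego_xml, length)
```

Only the attribute values change; the visible text in the `w:t` elements is identical before and after hiding.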

  • Created by: QiuYu
  • Created time: 2022/12/16

Originally implemented in 1664587146/xml-word-Steganography

Source code in stegobox/codec/word_stegano/word_stegano.py
class WordStegano(BaseCodec):
    """
    This algorithm can hide strings in a word document.

    Where different xml files of Word documents correspond to different word attributes.
    The document.xml file is the main document of the package, which records the textual
    content and other relevant attributes of the Word XML file; in XML-based Office
    documents, the most basic unit is the element, which can carry several attributes
    and their values as additional information. It is on the basis of such attributes
    that encryption and decryption is performed.

    "Identification attributes" are generally used to distinguish data or attributes
    such as text, tables, etc., and are characterized by having unique and randomly
    generated attribute values that are not related to the user or the time of
    modification. Changes to the value of the identity attribute do not affect the
    content of the text. Therefore, the purpose of hiding information in Office XML
    documents can be achieved by writing the information to be hidden in the attribute
    values.

    * Created by: QiuYu
    * Created time: 2022/12/16

    Originally implemented in
    [1664587146/xml-word-Steganography](https://github.com/1664587146/xml-word-Steganography)
    """

    def __init__(self) -> None:
        super().__init__()

    def encode(self, _):
        raise NotImplementedError("Use encode_save_docx() instead")

    def encode_save_docx(self, carrier: str, payload: str, output_path: str) -> int:
        """Encoder requires carrier .docx and payload to be a string.

        Args:
            carrier: The path of carrier text in format docx.
            payload: Payload (secret message) to be encoded. Payload is a word.
            output_path: The path to save encoded carrier docx.

        Returns:
            lens: The length of payload.
        """
        with TemporaryDirectory() as tempdir:
            lens = methods.watermark_of_word(carrier, payload, output_path, tempdir)
            return lens

    def decode(self, _):
        raise NotImplementedError("This codec does not support decoding without length")

    def decode_with_length(self, carrier: str, payload_length: int) -> str:
        """Decode the secret payload from the carrier docx.

        Args:
            carrier: The path of encoded carrier docx.
            payload_length: The length of secret message

        Returns:
            key: The decoded payload (secret message).
        """
        with TemporaryDirectory() as tempdir:
            key = methods.decode_of_word(carrier, payload_length, tempdir)
            return key

encode_save_docx(carrier, payload, output_path)

Encoder requires carrier .docx and payload to be a string.

Parameters:

Name         Type  Description                                                 Default
carrier      str   The path of carrier text in format docx.                    required
payload      str   Payload (secret message) to be encoded. Payload is a word.  required
output_path  str   The path to save encoded carrier docx.                      required

Returns:

Name  Type  Description
lens  int   The length of payload.

Source code in stegobox/codec/word_stegano/word_stegano.py
def encode_save_docx(self, carrier: str, payload: str, output_path: str) -> int:
    """Encoder requires carrier .docx and payload to be a string.

    Args:
        carrier: The path of carrier text in format docx.
        payload: Payload (secret message) to be encoded. Payload is a word.
        output_path: The path to save encoded carrier docx.

    Returns:
        lens: The length of payload.
    """
    with TemporaryDirectory() as tempdir:
        lens = methods.watermark_of_word(carrier, payload, output_path, tempdir)
        return lens

decode_with_length(carrier, payload_length)

Decode the secret payload from the carrier docx.

Parameters:

Name            Type  Description                        Default
carrier         str   The path of encoded carrier docx.  required
payload_length  int   The length of secret message.      required

Returns:

Name  Type  Description
key   str   The decoded payload (secret message).

Source code in stegobox/codec/word_stegano/word_stegano.py
def decode_with_length(self, carrier: str, payload_length: int) -> str:
    """Decode the secret payload from the carrier docx.

    Args:
        carrier: The path of encoded carrier docx.
        payload_length: The length of secret message

    Returns:
        key: The decoded payload (secret message).
    """
    with TemporaryDirectory() as tempdir:
        key = methods.decode_of_word(carrier, payload_length, tempdir)
        return key
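
Because decode() is not supported, the length returned by encode_save_docx() has to be kept and passed to decode_with_length(). A minimal sketch of that round trip follows. The `round_trip` helper, the `FakeCodec` stand-in, and the file names are illustrative assumptions; `FakeCodec` only mimics the two method signatures so the example runs without stegobox or a real .docx file.

```python
def round_trip(codec, carrier: str, payload: str, output_path: str) -> str:
    """Hide `payload` in `carrier`, then read it back from `output_path`."""
    # Keep the returned length: decoding cannot work without it.
    length = codec.encode_save_docx(carrier, payload, output_path)
    return codec.decode_with_length(output_path, length)


class FakeCodec:
    """Stand-in with the same method signatures (hypothetical, not steganography)."""

    def __init__(self) -> None:
        self._store = {}

    def encode_save_docx(self, carrier: str, payload: str, output_path: str) -> int:
        self._store[output_path] = payload
        return len(payload)

    def decode_with_length(self, carrier: str, payload_length: int) -> str:
        return self._store[carrier][:payload_length]


recovered = round_trip(FakeCodec(), "carrier.docx", "secret", "encoded.docx")
```

With stegobox installed, the same helper should accept a real codec instance, for example `round_trip(WordStegano(), "carrier.docx", "secret", "encoded.docx")` after `from stegobox.codec import WordStegano`, where carrier.docx is an existing Word document.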