OPS format documentation?

  • Videogamer555
    14th Nov 2013 Member
    What is the new OPS format? What is each header field in the file? As for the compressed data, after it is uncompressed, what is contained in each section of the data? Can someone post a reply to my thread, containing the full documentation of the OPS format you are using? And please don't direct me to read the source code. I suck at determining specs on a format when reading C or C++ code. I'm more used to a nice table or list layout like:
    Bytes 1 to 4 are this
    and Byte 5 is that,
    and Bytes 6 and 7 are the other thing.

    That's how I'm used to reading specs on a file format. I hope there are some official docs on TPT's OPS format, but I've not found any. If no such docs exist, could someone who IS proficient at reading C and C++ code read the portion of TPT's code pertaining to saves and stamps, and write up a nice table layout of the file format used by the OPS saves in the latest version of TPT? Something like the specs on the MS WAV audio file format at https://ccrma.stanford.edu/courses/422/projects/WaveFormat/

    where the format is spelled out in plain English.
    Edited 6 times by Videogamer555. Last: 14th Nov 2013
  • mniip
    14th Nov 2013 Developer
  • Videogamer555
    14th Nov 2013 Member
    How do I read this into an understandable format, for use in other programming languages? I happen to be using VB6. And I can't make heads or tails of the C++ code you posted. Giving up on the C++ code, I've decided to reverse engineer the structure of the file the way any good hacker would, through a hex editor, and knowing a bit about what kinds of data must be present in the save file.

    I finally figured out (through looking at a hex editor) enough of the header to be able to get the uncompressed size of the compressed data, and then uncompress the data. But now looking at the uncompressed game data, I see this in my hex editor in the ascii text display:
    Åä...origin.‚....majorVersion.Y....minorVersion......buildNum......snapshotId......releaseType.....R..platform.....WIN32..builtType.....SSE2...waterEEnabled...legacyEnable...gravityEnable...aheat_enable...paused...gravityMode......airMode......parts.¶!..€.

    And this in the corresponding part of the hexadecimal display:
    C5 E4 0D 00 03 6F 72 69 67 69 6E 00 82 00 00 00 10 6D 61 6A 6F 72 56 65 72 73 69 6F 6E 00 59 00 00 00 10 6D 69 6E 6F 72 56 65 72 73 69 6F 6E 00 00 00 00 00 10 62 75 69 6C 64 4E 75 6D 00 13 01 00 00 10 73 6E 61 70 73 68 6F 74 49 64 00 00 00 00 00 02 72 65 6C 65 61 73 65 54 79 70 65 00 02 00 00 00 52 00 02 70 6C 61 74 66 6F 72 6D 00 06 00 00 00 57 49 4E 33 32 00 02 62 75 69 6C 74 54 79 70 65 00 05 00 00 00 53 53 45 32 00 00 08 77 61 74 65 72 45 45 6E 61 62 6C 65 64 00 00 08 6C 65 67 61 63 79 45 6E 61 62 6C 65 00 00 08 67 72 61 76 69 74 79 45 6E 61 62 6C 65 00 01 08 61 68 65 61 74 5F 65 6E 61 62 6C 65 00 00 08 70 61 75 73 65 64 00 01 10 67 72 61 76 69 74 79 4D 6F 64 65 00 00 00 00 00 10 61 69 72 4D 6F 64 65 00 00 00 00 00 05 70 61 72 74 73 00 B6 21 03 00 80 AD

    So far, what I gather is that the first 4 bytes are a long integer giving the size of the uncompressed data (a repeat of the last entry in the header of the file). The next byte is a mystery though. It has the value 0x03. Immediately after that is an ASCII string. It would seem logical that the value preceding the string would be the length of the string. However, this value would indicate a length of 3, but the string "origin" has a length of 6. And it's not as simple as the stored value being half the actual string length. Other strings have no corresponding numerical value at all. And what's stranger is that after each ASCII string there is a small amount of binary data, usually about 4 to 6 bytes in size. That is, up until the string "parts", after which there is a large array (I assume particle data), but in what format I have no clue.

    This is why I need a FULLY DOCUMENTED specification of the save file format, in plain English, not C++ code. I only hope that whoever the head dev of TPT is, he will get around to looking at this thread and provide me the full specs in English.
    Edited once by Videogamer555. Last: 14th Nov 2013
  • boxmein
    14th Nov 2013 Former Staff
    You have 12 bytes of magic and then a bzip2-compressed buffer of data:

    // BSON DATA
    {
    "origin": {
    "majorVersion": integer SAVE_VERSION, // current TPT version that matters to the save server, eg 83
    "minorVersion": integer MINOR_VERSION, // TPT minor version - after the dot, eg 0
    "buildNum": integer BUILD_NUM, // TPT build version, always goes up, eg 272
    "snapshotId": integer SNAPSHOT_ID, // TPT snapshot version (rarely used), eg 1346881831
    "releaseType": string IDENT_RELTYPE, // TPT release type eg "R"
    "platform": string IDENT_PLATFORM, // Platform eg "WIN32"
    "builtType": string IDENT_BUILD // Build type eg "SSE2"
    },

    "waterEEnabled": boolean waterEEnabled, // Water equalisation
    "legacyEnable": boolean legacyEnable, // Legacy mode - no heat?
    "gravityEnable": boolean gravityEnable, // Newtonian Gravity
    "aheat_enable": boolean aheat_enable, // Ambient Heat
    "paused": boolean paused, // paused?
    "gravityMode": integer gravityMode, // Vertical/Radial/No gravity
    "airMode": integer airMode, // Off/No pressure/No velocity/No update/All on

    // "leftSelectedElement": integer sl,
    // "rightSelectedElement": integer sr,
    // "activeMenu": int active_menu,

    // following may or may not appear on the save file:

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L1833-L1995
    // Everything relating to particles. Make sure to check out the code!
    "parts": binary,

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L1819-L1831
    // Particle stacking array, check out the code here too
    "partsPos": binary,

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L1743-L1783
    // Wall map. Check out the code.
    "wallMap": binary,

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L1761-L1771
    // A special wall map for special fans
    "fanMap": binary,

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L1997-L2042
    // Even more special map for soap connection points. Make sure to read the code.
    "soapLinks": binary,

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L2077-L2085
    // Element Name/ID pairs as PaletteItems
    "<element name>": <element ID>,

    // https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L2086-L2111
    // Signs
    "signs": [
    // (each sign adds an object)
    {
    "text": string signs[i].text.c_str(),
    "justification": int signs[i].ju,
    "x": int signs[i].x,
    "y": int signs[i].y
    }
    // ...
    ]
    }

    // SAVE BYTES
    // Length of file: 12 bytes + length of the bzip2-compressed data (the output buffer is allocated as BSON data dump size * 2 + 12 bytes)
    [0]-[3]: 'OPS1',
    [4]: SAVE_VERSION,
    [5]: CELL, // 4
    [6]: blockW,
    [7]: blockH,
    [8]: finalDataLen // size of the BSON data structure up above
    [9]: finalDataLen >> 8
    [10]: finalDataLen >> 16
    [11]: finalDataLen >> 24

    [12]-: bzip2-compressed dump of the BSON object
    /*
    BZ2_bzBuffToBuffCompress( (char*)(outputData+12),
    &outputDataLen,
    (char*) finalData,
    bson_size(&b),
    9,
    0,
    0)
    */
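For anyone who would rather see the 12-byte header as code than as a byte list, here is a minimal Python sketch of the layout described above. The function name `read_ops_container` and the dictionary keys are made up for illustration, and it assumes the four length bytes are stored least significant byte first, as the `>> 8`, `>> 16`, `>> 24` lines suggest. It is not taken from TPT's source.

```python
import bz2
import struct


def read_ops_container(raw: bytes):
    """Split an OPS1 save into its header fields and the decompressed BSON bytes.

    Hypothetical helper following the byte layout listed in this post.
    """
    if len(raw) < 12 or raw[0:4] != b"OPS1":
        raise ValueError("not an OPS1 save")

    header = {
        "save_version": raw[4],  # [4]: SAVE_VERSION
        "cell_size": raw[5],     # [5]: CELL
        "block_w": raw[6],       # [6]: blockW
        "block_h": raw[7],       # [7]: blockH
    }

    # [8]-[11]: size of the uncompressed BSON data, low byte first.
    (bson_len,) = struct.unpack_from("<I", raw, 8)
    header["bson_len"] = bson_len

    # [12]-: everything after the header is one bzip2 stream.
    bson_bytes = bz2.decompress(raw[12:])
    if len(bson_bytes) != bson_len:
        raise ValueError("decompressed size does not match the header")

    return header, bson_bytes
```

Typical use would be `header, bson_bytes = read_ops_container(open(path, "rb").read())`, after which `bson_bytes` is what a BSON reader should be pointed at.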
    Edited once by boxmein. Last: 14th Nov 2013
  • jacksonmj
    14th Nov 2013 Developer

    To translate the uncompressed bytes into the format given by @boxmein, see http://bsonspec.org/#/specification

    To understand that, you'll need to know how to read BNF specifications, which might not be verbose English but are a detailed explanation of how to interpret data, and are pretty easy to comprehend once you're familiar with them.

     

    I'm not going to give a detailed explanation in English of BSON. However, to help you check your understanding of the BNF for it, here is part of the hexadecimal you posted, interpreted according to the specification (compare it to the structure shown in boxmein's post):

    C5 E4 0D 00 : int32, length of document

      03 : start of embedded document

      6F 72 69 67 69 6E 00 : name of embedded document as a 0 terminated string ("origin").

      82 00 00 00 : int32, length of embedded document

         followed by the elements in the embedded document

         10 : type of first element = int32

         6D 61 6A 6F 72 56 65 72 73 69 6F 6E 00 : name of first element as a 0 terminated string ("majorVersion").

         59 00 00 00: value of first element = 89

         10 : type of second element = int32

         6D 69 6E 6F 72 56 65 72 73 69 6F 6E 00 : name of second element as a 0 terminated string ("minorVersion").

         00 00 00 00 : value of second element = 0

     

     

    Interpretation of the binary data in parts and partsPos would be a nice thing to have documented, though personally I find the readOPS function pretty straightforward to follow (https://github.com/simtr/The-Powder-Toy/blob/master/src/client/GameSave.cpp#L698-L1033 as opposed to serialiseOPS in the links from boxmein and mniip, since the read function is probably more useful if you want to know how to read a save file).
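The byte-by-byte walk-through above can also be written as a small recursive reader. What follows is a rough Python sketch, not TPT's code and not a complete BSON implementation: it only handles the element types that show up in the dump (int32, boolean, string, embedded document, binary) plus arrays, and the names `read_cstring` and `read_document` are invented for this example.

```python
import struct


def read_cstring(buf: bytes, pos: int):
    """Read a 0-terminated name; return (text, offset just past the 00)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("ascii"), end + 1


def read_document(buf: bytes, pos: int = 0):
    """Decode the BSON document starting at buf[pos]; return (dict, next offset)."""
    (doc_len,) = struct.unpack_from("<i", buf, pos)
    end = pos + doc_len  # the length counts its own 4 bytes and the final 00
    pos += 4
    out = {}
    while buf[pos] != 0x00:  # a 00 where a type byte is expected ends the document
        elem_type = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if elem_type == 0x10:  # int32
            (value,) = struct.unpack_from("<i", buf, pos)
            pos += 4
        elif elem_type == 0x08:  # boolean, one byte
            value = buf[pos] != 0
            pos += 1
        elif elem_type == 0x02:  # string: int32 length (includes trailing 00), then bytes
            (n,) = struct.unpack_from("<i", buf, pos)
            value = buf[pos + 4:pos + 4 + n - 1].decode("utf-8")
            pos += 4 + n
        elif elem_type in (0x03, 0x04):  # embedded document, or array keyed "0", "1", ...
            value, pos = read_document(buf, pos)
        elif elem_type == 0x05:  # binary: int32 length, subtype byte, then the data
            (n,) = struct.unpack_from("<i", buf, pos)
            value = (buf[pos + 4], buf[pos + 5:pos + 5 + n])
            pos += 5 + n
        else:
            raise ValueError("element type 0x%02X not handled in this sketch" % elem_type)
        out[name] = value
    if pos + 1 != end:
        raise ValueError("document length does not match its contents")
    return out, end
```

Fed the 130 bytes that start at `82 00 00 00` in the dump, this should stop exactly at the end of the "origin" sub-document. That length prefix counts only the embedded document, not the whole save; the whole save is the `C5 E4 0D 00` at the very start.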

    Edited once by jacksonmj. Last: 14th Nov 2013
  • boxmein
    14th Nov 2013 Former Staff
    @jacksonmj
    Hmm.. I should improve this and make a cross-reference page somewhere.
  • Videogamer555
    14th Nov 2013 Member
    What about the last part with 05 70 61 72 74 73 00 B6 21 03 00 80
    05 means data type is binary
    70 61 72 74 73 00 is the null terminated string "parts"
    B6 21 03 00 is the length of the binary data
    And I assumed that the binary data started IMMEDIATELY after that, but it doesn't. If it did, the data doesn't properly line up with the next entry. To make it line up I have to assume the next byte is not part of the data, but rather part of the header.

    80 means I have no clue what, but it appears to be part of the header, because the data doesn't line up if the 80 is assumed to be part of the data, and only lines up if it is part of the header.

    Please explain. This is VERY CRUCIAL in understanding the general BSON format, such that I can make a reader for it, for the purpose of decoding TPT OPS1 saves in VB6.
  • jacksonmj
    14th Nov 2013 Developer

    A binary element has the form:

      "\x05" e_name binary

    = (05) (70 61 72 74 73 00) (B6 21 03 00 80 ....)

     

    The 'binary' part of that has the form:

      binary ::= int32 subtype (byte*)

    = (B6 21 03 00) (80) (....)

     

    The 0x80 byte is therefore the 'subtype'. The (byte*) that follows is the actual data ((byte*) in BNF means 0 or more bytes, not a pointer as it might in C/C++).

      subtype ::= "\x00"     Binary / Generic
        |     "\x01"     Function
        |     "\x02"     Binary (Old)
        |     "\x03"     UUID (Old)
        |     "\x04"     UUID
        |     "\x05"     MD5
        |     "\x80"     User defined

     

    subtype is a description of the binary data, so if storing an MD5 hash as a binary element, subtype could be 0x05. Some subtypes have an effect on how the subsequent bytes should be interpreted (hover the mouse over the (i) symbols in the specification), but 0x80 means "the structure of the binary data can be anything". So yes, the 0x80 is part of the header and the subsequent bytes are the data.
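To make the position of the subtype byte concrete, here is a short Python sketch that reads a single binary element such as "parts". The names `SUBTYPES` and `read_binary_element` are made up for this post. The special case for subtype 0x02 follows the BSON specification's note that the old binary subtype carries a second int32 length inside the data. The "parts" element uses 0x80, so that branch should not come into play for it.

```python
import struct

# Subtype names as listed in the BSON specification excerpt above.
SUBTYPES = {
    0x00: "Binary / Generic",
    0x01: "Function",
    0x02: "Binary (Old)",
    0x03: "UUID (Old)",
    0x04: "UUID",
    0x05: "MD5",
    0x80: "User defined",
}


def read_binary_element(buf: bytes, pos: int):
    """Read one '\\x05 e_name binary' element; return (name, subtype, data, next offset)."""
    if buf[pos] != 0x05:
        raise ValueError("not a binary element")

    # e_name: 0-terminated string right after the type byte.
    name_end = buf.index(b"\x00", pos + 1)
    name = buf[pos + 1:name_end].decode("ascii")

    # binary ::= int32 subtype (byte*)
    (length,) = struct.unpack_from("<i", buf, name_end + 1)
    subtype = buf[name_end + 5]  # header byte, NOT counted in length
    start = name_end + 6         # first byte of the actual data
    data = buf[start:start + length]
    if len(data) != length:
        raise ValueError("binary element is truncated")

    if subtype == 0x02:
        # Old binary subtype: the data itself begins with another int32 length.
        (inner,) = struct.unpack_from("<i", data, 0)
        data = data[4:4 + inner]

    return name, subtype, data, start + length
```

The returned offset points at whatever follows the element, which is where the type byte of the next element (or the closing 00 of the document) should line up.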

    Edited 6 times by jacksonmj. Last: 14th Nov 2013
  • Videogamer555
    14th Nov 2013 Member
    This makes no sense:
    82 00 00 00 : int32, length of embedded document

    0x82 is decimal number 130. There is NO WAY that the length of the embedded document (the save file) contains only 130 bytes. This number must be wrong. If not, please explain.



    Also, someone should make a manual with all the format specs pertaining to TPT, so that someone can easily make 3rd-party save file reader/writer software.
    I mean look at all the specs here.

    Save file:
    Save file header
    BZ2 compressed data

    After decompression, data contains:
    BSON format

    BSON format contains:
    Major version
    Minor version
    lots of other stuff all the way down to
    Parts
    PartsPos

    Parts format = proprietary (undocumented, why's nobody documented this)
    PartsPos format = proprietary (undocumented, why's nobody documented this)



    ALL OF THIS STUFF should be put into a single word document, html document, or pdf document, called "This is all the format info you will need to make 3rd party software for reading and writing TPT save and stamp files".

    Nobody should have to go through hell just to find a bunch of different specs, scattered all around the web, each one not fully self contained, but instead making references to other specs that then have to be separately searched for, until you've FINALLY gathered enough info to make a piece of software to read and/or write TPT save and stamp files.
    Edited once by Videogamer555. Last: 14th Nov 2013
  • jacob1
    14th Nov 2013 Developer
    How is it proprietary? If you can understand hex editors and WAV formats, I'm sure you can understand the easily readable and commented C++ code that reads the file.

    90% of people who want to read TPT saves do it in a better way. Either they use C++ and copy the code directly (easiest way), or they at least use a BSON lib. You are attempting to read two unrelated formats at once; that is the problem here. Maybe you should work on a BSON parser first before you even start trying to read the TPT save format.

    Or, even better, don't use VB6. I don't know what exactly you are attempting to do, but VB6 isn't really the way to do it. That's just a beginner's language they teach you in school so you can start to learn some programming concepts, but nobody in real life actually uses it ...