dunham 1 hour ago [-]
Nice work, thanks for taking the time to write it up. I regret not doing that for my projects.

I also did something similar back around 2014 in https://github.com/dunhamsteve/iwork, which translates iWork files to HTML, but I didn't get much further than tables on the Numbers side before taking a break. That code has been largely neglected since then, and I never wrote up my process. Like the other commenter, I based this on https://github.com/obriensp/iWorkFileFormat

For ObjC programs that don't embed the descriptors, I wrote a Python script that reverse engineers protobuf schemas from disassembled code: https://gist.github.com/dunhamsteve/224e26a7f56689c33cea4f0f... I don't remember what project that was for, but maybe it's useful to someone.

And for Notes.app, I reverse engineered the schema from the binary protobuf data. Since the wire format doesn't distinguish raw binary data from nested objects (both are length-delimited fields), my script would build a tentative schema and then refine it against further examples. I later learned that the full schema, in text form, was embedded in the web version of the application. That project is at https://github.com/dunhamsteve/notesutils and is also neglected. I believe the table format has changed enough that tables are no longer working.
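
To make the tentative-schema idea concrete, here is a minimal sketch. It is not the notesutils code, and the names (`read_varint`, `parse_fields`, `refine`) and the dict-based schema layout are made up for illustration. The assumption is that a length-delimited field is treated as a nested message for as long as every sample seen so far parses as one, and is demoted to raw bytes the first time a sample doesn't.

```python
# Illustrative only: guess a protobuf schema from raw samples.
WIRE_NAMES = {0: "varint", 1: "fixed64", 5: "fixed32"}


def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long")


def parse_fields(buf):
    """Split buf into (field_number, wire_type, value) triples.

    Raises ValueError if buf is not well-formed wire-format data.
    """
    pos = 0
    fields = []
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if number == 0:
            raise ValueError("field number 0 is invalid")
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # 64-bit fixed
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited: bytes OR nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # 32-bit fixed
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type")
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((number, wire_type, value))
    return fields


def refine(schema, buf):
    """Update the tentative schema (a dict keyed by field number) in place."""
    for number, wire_type, value in parse_fields(buf):
        entry = schema.setdefault(number, {"kind": None, "fields": {}})
        if wire_type != 2:
            entry["kind"] = WIRE_NAMES[wire_type]
            continue
        if entry["kind"] == "bytes":
            continue  # already ruled out as a message by an earlier sample
        try:
            parse_fields(value)
        except ValueError:
            # This sample can't be a message, so the field must be raw bytes.
            entry["kind"] = "bytes"
            entry["fields"] = {}
            continue
        entry["kind"] = "message"  # still only a guess
        refine(entry["fields"], value)
    return schema
```

Calling `refine` with the same dict on each new sample tightens the guess. A field that stays "message" after many samples is probably one, but short binary blobs can parse as valid messages by accident, so the result is only ever a best guess.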

psobot 3 hours ago [-]
Nice work! I had the same fun RE adventure in https://github.com/psobot/keynote-parser a couple years back, based on Sean Patrick O'Brien's work back in 2013: https://github.com/obriensp/iWorkFileFormat/blob/master/Docs...
mackross 8 hours ago [-]
Amazing work by the author!