OpenOrCadParser

more format details

Open chilauhe opened this issue 1 year ago • 3 comments

I've been working on a similar task recently and found this project, but I'm writing a standalone tool in pure C that doesn't rely on any third-party library. I have already implemented compound file reading and writing; the next part is parsing the content files.

After reading some of the project's code, I still can't get a clear picture. Can someone shed more light on the OrCAD file format itself?

chilauhe, Feb 28 '23 08:02

So far there is no documentation of the file format itself. I have thought a few times about writing one, either as plain text or as a Kaitai Struct file, but at the moment there are many parts of the format that I don't understand myself. Writing the documentation now would mean a lot of effort to keep it up to date and to refactor it. Changing the code base is already a huge workload whenever I realize that the file structure differs from what I expected. Nonetheless, in the long run this project should also provide a detailed file format specification.

A few hints to get you started:

  • Streams are the parts contained inside the compound file, which you can already extract. If you want to know what a stream contains, take a look at the corresponding parser in the Streams folder.
  • Structures are basic building blocks that recur in many different streams.
  • Primitives are similar to Structures but deviate slightly in how they are organized. They are mostly graphical elements.

Let's take a look at a short example. You extract an *.OLB file and find a folder named Packages. Its content is parsed by StreamPackage.

An excerpt of the code is printed below. I added a few comments describing what it does and removed unnecessary debug statements to make it clearer.

void StreamPackage::read(FileFormatVersion /* aVersion */)
{
    // `ds` stands for `DataStream`; this is basically the Package file we are currently parsing
    auto& ds = mCtx.get().mDs.get();

    // Read 2 bytes from the file and interpret them as an unsigned integer.
    // These 2 bytes define the number of properties that follow
    const uint16_t lenProperties = ds.readUint16();

    // Let's start reading all these properties
    for(size_t i = 0u; i < lenProperties; ++i)
    {
        // `readStructure` is a generic method that reads arbitrary Structures. We know
        // that the structure must be a `StructProperties` here, so we cast it to that type
        // and store the result
        properties.push_back(dynamic_pointer_cast<StructProperties>(readStructure()));

        // Again a 2 byte value specifying the number of primitives that follow it
        const uint16_t lenPrimitives = ds.readUint16();

        for(size_t j = 0u; j < lenPrimitives; ++j)
        {
            // The same as the outer loop, but note that `StructPrimitives` is a `Structure`, not
            // a `Primitive`, according to my naming convention. It just contains a few primitives,
            // hence the name
            primitives.push_back(dynamic_pointer_cast<StructPrimitives>(readStructure()));
        }
    }

    // I don't know a useful name for this structure, therefore I named it after the hexadecimal
    // value that represents its structure type -> Type 0x1F
    t0x1f = dynamic_pointer_cast<StructT0x1f>(readStructure());

    // The file must end here, otherwise the parser is incorrect because we parsed too few bytes
    if(!ds.isEoF())
    {
        throw std::runtime_error("Expected EoF but did not reach it!");
    }
}

Whenever you see a function call prefixed with read, some data is read from the file. Just tracing those calls should give you some insight into the bytes in a file.
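To make the read-call pattern more concrete, here is a minimal, self-contained sketch of a count-prefixed read followed by an end-of-data check. It is not the project's code: `ByteReader` and `readCountedBytes` are hypothetical names made up for illustration, the items are simplified to single bytes instead of real Structures, and little-endian byte order is an assumption on my side.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for the project's DataStream; not the real class.
class ByteReader
{
public:
    explicit ByteReader(std::vector<std::uint8_t> aData) : mData{std::move(aData)} {}

    // Read a single byte and advance the read position
    std::uint8_t readUint8()
    {
        if(mPos >= mData.size())
        {
            throw std::runtime_error("Unexpected end of data!");
        }
        return mData[mPos++];
    }

    // Assumption: multi-byte values are stored little-endian (low byte first)
    std::uint16_t readUint16()
    {
        const std::uint16_t lo = readUint8();
        const std::uint16_t hi = readUint8();
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }

    // True once every byte has been consumed
    bool isEoF() const { return mPos == mData.size(); }

private:
    std::vector<std::uint8_t> mData;
    std::size_t mPos = 0u;
};

// Reads a 2 byte count followed by that many 1 byte items and
// requires that the data ends right after the last item.
std::vector<std::uint8_t> readCountedBytes(ByteReader& aReader)
{
    const std::uint16_t len = aReader.readUint16();

    std::vector<std::uint8_t> items;
    items.reserve(len);
    for(std::size_t i = 0u; i < len; ++i)
    {
        items.push_back(aReader.readUint8());
    }

    // Leftover bytes mean the assumed layout is wrong
    if(!aReader.isEoF())
    {
        throw std::runtime_error("Expected EoF but did not reach it!");
    }
    return items;
}
```

For the input bytes 02 00 AA BB, `readCountedBytes` reads a count of 2 and returns the items AA and BB. The strict end-of-data check is useful while reverse engineering because a wrong guess about the layout tends to fail loudly instead of silently producing garbage.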

Werni2A, Feb 28 '23 17:02

I'm not sure whether you want to implement a parser based on existing file format documentation or reverse engineer the format yourself. In the latter case, I'd like to encourage you to share your findings. Reverse engineering is a tedious process and we do not need to reinvent the wheel multiple times.

Werni2A, Feb 28 '23 19:02

Thank you for the information. I'll absolutely share my findings. My final aim is to get the schematic and BOM without OrCAD Capture. I don't want to reinvent the wheel, so this is a good start. I'm an electrical engineer, and the slow and buggy OrCAD Capture 16.6 wastes a lot of my time. Doing some research on it might make my work easier, and that is the source of my motivation. I may be able to resume my research later next week; this week I'm busy with some formal project design. With some background in desktop software development and reverse engineering, I believe it's just a matter of time.

chilauhe, Mar 04 '23 06:03