# faultreiber

[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber?ref=badge_shield)

`faultreiber` generates a parser library in C for a structured (binary) file format. The input is an XML file that describes the format.
The C source code will be in the form of multiple source and header files.
The headers have header guards and are already `extern "C"`ed.
## Demo

For a practical example, look at the example XML file under `resources`. The XML file describes the format of a WASM object file.
To run the demo, run `run.sh`, then go to the `test` directory and run `make test` to run the executable.
To run the executable under `valgrind --leak-check=yes`, run `make valgrind`.
## Memory Leaks

Code generated by faultreiber should not leak any memory if everything went according to plan during code generation. If that's not the case, let me know.
## How to Use

A function named `read_aggr_{name}` will be generated that takes an `int _fd` file descriptor for the file that it will read. `{name}` is what you pass to faultreiber with the `--name` option.
The return type will be a C structure with type `{name}_lib_ret_t`. The struct is defined as:
```C
typedef struct {
  name_obj_t obj;
  void** void_train;
  uint64_t* current_void_size;
  uint64_t* current_void_count;
}name_lib_ret_t;
```

`{name}_obj_t` is a C structure defined in `aggregate.h` that holds all the read modules.
A function named `release_all_{name}` will be generated in `aggregate.c` that releases almost all the memory.
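
How a single call can release most allocations is easier to see with a sketch. The following is not faultreiber's generated code. It is a minimal illustration, assuming the `void_train` is a growable array of `void*` that records every allocation made while parsing, sized and grown along the lines of `--voidbuffersize` and `--voidbuffgrowfactor`. The names `train_t`, `train_push` and `train_release_all` are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the void_train bookkeeping; not generated code. */
typedef struct {
  void **slots;   /* recorded allocations */
  uint64_t size;  /* capacity of slots */
  uint64_t count; /* number of recorded allocations */
} train_t;

/* Record ptr, growing the slot array by the factor `grow` when it is full.
 * Returns 0 on success, -1 if the slot array could not be grown. */
static int train_push(train_t *t, void *ptr, uint64_t grow) {
  if (grow < 2) grow = 2; /* a factor below 2 would never add capacity */
  if (t->count == t->size) {
    uint64_t new_size = t->size ? t->size * grow : 4;
    void **tmp = realloc(t->slots, new_size * sizeof(void *));
    if (!tmp) return -1;
    t->slots = tmp;
    t->size = new_size;
  }
  t->slots[t->count++] = ptr;
  return 0;
}

/* Free every recorded allocation, then the slot array itself.
 * Returns how many recorded allocations were freed. */
static uint64_t train_release_all(train_t *t) {
  uint64_t freed = t->count;
  for (uint64_t i = 0; i < t->count; ++i) free(t->slots[i]);
  free(t->slots);
  t->slots = NULL;
  t->size = 0;
  t->count = 0;
  return freed;
}
```

In this sketch the caller still owns the `train_t` itself, which mirrors why the client code has to free a couple of things on its own after the release call.
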
The proper order of releasing the memory in the client code is shown below, assuming the return value of `read_aggr_{name}` is stored in `lib_ret` and `--name` was passed a value of `wasm`:
```C
release_all_wasm(lib_ret->void_train, lib_ret->current_void_count);
free(lib_ret->obj);
free(lib_ret);
```

## faultreiber XML file

The root node should have two children, named exactly `READ` and `DEFINITION` (order not important).
The `READ` node will include the actual structures that the parser will read and can return.
The `DEFINITION` node includes the definitions for the structures that are aggregate.
## Rules:

Any child node of either `DEFINITION` or `READ` must have at least the attributes `name` and `type` defined. The `count` attribute is optional; if it is not present, faultreiber will assume that the count is one.
The presence of the attribute `isaggregate` signifies that the data structure is composed of other smaller parts. faultreiber will only read the children of a node that is a child of either the `DEFINITION` or `READ` node (unless a child node has the attribute `conditional` set). If a data structure requires more children, add a new node under `DEFINITION` and reference that node from its parent. In other words, an aggregate node can't itself have child nodes that are aggregate.
`count`, `size`, `type` and `condition` attributes can reference a child node of the `DEFINITION` node. To do that, you should use `self::TAG`.
The tag names of nodes on the same level should be unique. The `name` attributes of nodes on the same level need to be unique as well.
The order of the nodes that appear as children of the `DEFINITION` node, even when the child nodes are referencing each other, is unimportant to faultreiber.
Tags should follow the naming convention for XML nodes. The `name` attributes should follow the C identifier naming convention (if the value of a `name` attribute is not a valid C identifier, you're going to end up with code that won't even build).
The following values are valid values for the `type` attribute:
* int8
* uint8
* int16
* uint16
* int32
* uint32
* int64
* uint64
* int128
* uint128
* float
* double
* string
* FT::conditional
* self::TAG

For string nodes, the node should either have a non-empty `size` attribute or have a `delimiter` attribute. In case a `delimiter` attribute is selected the value of the delimiter should be provided as the value of the `delimiter` attribute to the node.
Strings read through a `delimiter` node will have their delimiter attached to the end of the string (null-terminated or otherwise). String reads that have a `size` attribute will be forcefully null-terminated even if the original string was not null-terminated.
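
The two string behaviours can be illustrated with a small sketch. This is not the generated reader: it works on an in-memory buffer instead of a file descriptor, and the function names `read_sized` and `read_delimited` are hypothetical. It only shows the difference described above, namely that a sized read always gets a terminating null byte while a delimited read keeps its delimiter as the last byte.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sized read: copy exactly `size` bytes and always append '\0'. */
static char *read_sized(const unsigned char *src, size_t size) {
  char *out = malloc(size + 1);
  if (!out) return NULL;
  memcpy(out, src, size);
  out[size] = '\0';
  return out;
}

/* Delimited read: copy up to and including the first `delim` byte.
 * *out_len receives the number of bytes copied, delimiter included.
 * Returns NULL if the delimiter is not found within `avail` bytes. */
static char *read_delimited(const unsigned char *src, size_t avail,
                            unsigned char delim, size_t *out_len) {
  const unsigned char *hit = memchr(src, delim, avail);
  if (!hit) return NULL;
  size_t len = (size_t)(hit - src) + 1;
  char *out = malloc(len);
  if (!out) return NULL;
  memcpy(out, src, len);
  *out_len = len;
  return out;
}
```

Note that the delimited result is only a valid C string when the delimiter itself is a null byte, so callers of such a read should carry the length along.
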
Child nodes of the `READ` node that have the `unordered` attribute set will be regarded as such, meaning they can appear in the file sporadically. Such nodes must have a child node with the attribute `sign`. The value of the `sign` attribute is used to check for the presence of the parent node in the file.
`unorderedbegin` and `unorderedend` attributes denote the beginning and end of an unordered section in the `READ` node. For every unordered section, only one node needs to define the begin and end attributes. All the other nodes, including the nodes that define the `unorderedbegin` and `unorderedend` attributes, shall have the `unordered` attribute defined.
Any child of the `READ` node that is not inside an unordered block or doesn't have the `unordered` attribute set will be regarded as ordered.
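
To make the role of `sign` more concrete, here is a rough sketch of how a reader might recognise unordered sections by their sign value. The record layout (one sign byte, one length byte, then the payload), the sign values and the names `seen_t` and `scan_unordered` are all made up for illustration; the code faultreiber generates may look quite different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sign values for two unordered sections. */
enum { SIGN_TYPE = 0x01, SIGN_CODE = 0x0a };

typedef struct {
  int type_seen; /* records whose sign matched SIGN_TYPE */
  int code_seen; /* records whose sign matched SIGN_CODE */
  int unknown;   /* records with any other sign */
} seen_t;

/* Toy layout: each record is a sign byte, a length byte, then the payload.
 * Records may appear in any order; the sign decides which one it is. */
static seen_t scan_unordered(const uint8_t *buf, size_t len) {
  seen_t seen = {0, 0, 0};
  size_t pos = 0;
  while (pos + 2 <= len) {
    uint8_t sign = buf[pos];
    size_t payload = buf[pos + 1];
    if (pos + 2 + payload > len) break; /* truncated record, stop */
    switch (sign) {
      case SIGN_TYPE: seen.type_seen++; break;
      case SIGN_CODE: seen.code_seen++; break;
      default: seen.unknown++; break;
    }
    pos += 2 + payload;
  }
  return seen;
}
```
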
Whether `int128` or `uint128` are defined depends on the C implementation you are using on your host. If 128-bit integers are not supported or you need to read in bigger integers, you can simply use a smaller int type and increase the `count` attribute accordingly.
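
As an illustration of the workaround, a 128-bit value can be treated as two 64-bit limbs, which is roughly what a `uint64` node with `count` set to 2 would give you. The sketch below assumes little-endian file order (the only order faultreiber reads) and uses the hypothetical names `wide128_t`, `load_le64` and `load_le128`; it is not generated code.

```c
#include <stdint.h>

/* limb[0] holds the low 64 bits, limb[1] the high 64 bits. */
typedef struct {
  uint64_t limb[2];
} wide128_t;

/* Assemble a 64-bit value from 8 little-endian bytes, independent of
 * the host's own byte order. */
static uint64_t load_le64(const uint8_t *p) {
  uint64_t v = 0;
  for (int i = 7; i >= 0; --i) v = (v << 8) | p[i];
  return v;
}

/* Read 16 little-endian bytes as two consecutive 64-bit limbs. */
static wide128_t load_le128(const uint8_t *p) {
  wide128_t w;
  w.limb[0] = load_le64(p);
  w.limb[1] = load_le64(p + 8);
  return w;
}
```
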
The `FT::conditional` tag for a type means that the actual content of the node will depend on a value. The attribute `condition` will provide what that condition is. The value for the condition should be provided as text for the different nodes that define what the actual contents should be.
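
On the C side, a conditional node behaves much like a tagged union: one value is read first and it decides how the rest is interpreted. The sketch below is only an illustration of that idea. The kind values `0` and `1`, the body layouts and the names `cond_field_t` and `read_conditional` are assumptions, not what faultreiber emits.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint8_t kind; /* the value the condition refers to */
  union {
    uint8_t small; /* body when kind == 0 */
    uint32_t wide; /* body when kind == 1 */
  } as;
} cond_field_t;

/* Returns the number of bytes consumed, or 0 if the kind is unknown
 * or the buffer is too short for the selected body. */
static size_t read_conditional(const uint8_t *buf, size_t len,
                               cond_field_t *out) {
  if (len < 1) return 0;
  out->kind = buf[0];
  switch (out->kind) {
    case 0: /* hypothetical: kind 0 selects a one-byte body */
      if (len < 2) return 0;
      out->as.small = buf[1];
      return 2;
    case 1: /* hypothetical: kind 1 selects a 4-byte little-endian body */
      if (len < 5) return 0;
      out->as.wide = (uint32_t)buf[1] | ((uint32_t)buf[2] << 8) |
                     ((uint32_t)buf[3] << 16) | ((uint32_t)buf[4] << 24);
      return 5;
    default:
      return 0;
  }
}
```
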
The `size` attribute is currently only meaningful when the `type` attribute is set to `string`, in which case it denotes the size of the string.
## Options

```bash
  -h, --help            show this help message and exit
  --targetname TARGETNAME
                        main target name
  --outdir OUTDIR       path to output dir
  --structs STRUCTS     the structs json file
  --structsinclude STRUCTSINCLUDE
                        the path to the header that's going to be included by
                        structs.h before structure declarations.
  --xml XML             paht to the xml file
  --dbg                 debug
  --datetime            print date and time in autogen files
  --inline              inlines reader funcs
  --static              statics reader funcs
  --verbose             verbose
  --forcenullterm       terminate all strings with null even if they are not
                        originally null-terminated
  --strbuffersize STRBUFFERSIZE
                        the size of the buffer for string reads
  --strbuffgrowfactor STRBUFFGROWFACTOR
                        the factor by which the strbuffer will grow
  --voidbuffersize VOIDBUFFERSIZE
                        the size of the buffer for void* buffer
  --voidbuffgrowfactor VOIDBUFFGROWFACTOR
                        the factor by which the voidbuffer will grow
  --singlefile          the generated code will be put in a single file
  --singlefilename SINGLEFILENAME
                        name of the single file
  --name                will be used in generating some code identifiers
```

## limitations

Big-Endian reads are not supported.
Non-byte-sized raw reads are not supported.
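
Until big-endian reads are supported, one possible client-side workaround is to swap the bytes of a field after it has been read. This is a sketch under the assumption that the reader hands back the raw bytes interpreted as little-endian; the helpers `swap16` and `swap32` are hypothetical and not part of the generated code.

```c
#include <stdint.h>

/* Reverse the byte order of a 16-bit value. */
static uint16_t swap16(uint16_t v) {
  return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v) {
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
         ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}
```
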
## makefile

That would be on you, but there is an example makefile in the `test` directory that you can look at.
You can also get generic ones from [here](https://github.com/bloodstalker/lazymakefiles). They're licensed under the Unlicense.
## TODO

* All the items under limitations.
* Figure out what the license of the generated code is.
## Projects

The list of projects that use faultreiber:
* [bruiser](https://github.com/bloodstalker/mutator/tree/master/bruiser)
## License

`faultreiber` is provided under MIT. I'm assuming (I'm not a lawyer) the generated code is considered "derived work". If it is, then the generated code will also fall under MIT.
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber?ref=badge_large)