# faultreiber `faultreiber` generates a parser library in C for a structured file format. The input is an XML file that describes the format.
The C source code will be in the form of multiple source and header files. a makefile is also included.
The generated source code does not include a main.
## faultreiber XML file The root node should have two childs, named exactly `READ` and `DEFINITION`(order not important).
The `READ` node will include the actual structures that the parser will read and can return.
The `DEFINITION` node includes the definitions for the structures that are aggregate.
## Demo For a practical example, look at the example XML file under `resources`. The XML file describes the format of a WASM object file:
To run the demo, run `run.sh`, go to the `test` direcotory and run `make test` to run the executable.
To run `valgrind --leak-check=yes` run `make valgrind`.
## Memory Leaks Code generated by faultreiber should not leak any memory if everything went according to plan during code-gen. If that's not the case let me knkow.
## How to Use A function named `read_aggr_{name}` will be generated that takes an `int _fd` file descriptor for the file that it will read.`{name}` is what you pass to faultreiber with the `--name` option.
The return type will be a C structure with type `{name}_lib_ret_t`. The struct is defined as:
```C typedef struct { name_obj_t obj; void** void_train; uint64_t* current_void_size; uint64_t* current_void_count; }name_lib_ret_t ``` `{name}_obj_t` is a C structure defined in `aggregate.h` that holds all the read modules.
A function named `release_all_{name}` will be generated in `aggregate.c` that releases almost all the memory.
The proper order of realeasing the memory in the client code will be like below assuming the return value of `read_aggr_{name}` is stored in `lib_ret` and `--name` was passed a value of `wasm`:
```C release_all_wasm(lib_ret->void_train, lib_ret->current_void_count); free(lib_ret->obj); free(lib_ret); ``` ### Rules: Any child node of either `DEFINITION` or `READ` will have to at least have the attributes `name` and `type` defined. The presence of the attribute `count` is optional but if it's not present faultreiber will assume that the count is one.
The presence of the attribute `isaggregate` signifies the fact that the data structure is composed of other smaller parts. faultreiber will only read the children of a node that is the child of either the `DEFINITION` or `READ` node(unless a child node has the attribute `conditional` set). If a data structure requires more children then you should add a new node under `DEFINITION` and reference that node from it's parent. In other words, an aggregate node can't itself have child nodes that are aggregate.
`count`, `size`, `type` and `condition` attributes can reference a child node of the `DEFINITION` node. To do that, you should use `self::TAG`.
the tag names of the nodes that are on the same level should be unique. The `name` attribute of the nodes on the same level need to be unique as well.
The order of the nodes that appear as children of the `DEFINITION` node, even when the child nodes are referencing each other, is unimportant to faultreiber.
Tags should follow the naming convention for naming XML nodes. The `name` attributes should follow the C identifier naming convention(if the value of the `name` attribute is invalid in C as as identifier you're going to end up with code that won't even build).
The following values are valid values for the `type` attribute:
* int8 * uint8 * int16 * uint16 * int32 * uint32 * int64 * uint64 * int128 * uint128 * float * double * string * FT::conditional * self::TAG For string nodes, the node should either have a non-empty `size` attribute or have a `delimiter` attribute. In case a `delimiter` attribute is selected the value of the delimiter should be provided as the value of the `delimiter` attribute to the node.
Strings read through a `delimiter` node will have their delimiter attached to the end of the string(null-terminated or otherwise). String reads that have a `size` attribute will be forcefully null-terminated even if the original string was not null-terminated.
Child nodes of `READ` node that have the `unordered` attribute set, will be regarded as such, meaning they can appear in the file sporaically. Such nodes will have to have a child node with attriute `sign`.The value of the sign attribute is used to check for the presence of the parent node in the file.
`unorderedbegin` and `unorderedend` attributes denote the begenning and end of an unordered section in the `READ` node. For every unordered section, only one node needs to define the begin and end attributes. All the other nodes, including the nodes that define the `unorderedbegin` and `unorderedend` attributes, shall have the `unordered` attribute defined.
Any child of the `READ` node that is not inside an unordered block or doesnt have the `unordered` attribute set, will be regarded as ordered.
Whether `int128` or `uint128` are defined depends on your the C implementation you are using on your host. If 128-bit integers are not supported or you need to read in bigger integers, you can simply use a smaller int type and increase the `count` attribute accordingly.
The `FT::conditional` tag for a type means that the actual content of the node will depend on a value. The attribute `condition` will provide what that condition is. The value for the condition should be provided as text for the different nodes that define what the actual contents should be.
`size` attribute is currently only meaningful when the `type` attribute is set as `string` in which case it denotes the size of the string.
## Options ```bash -h, --help show this help message and exit --targetname TARGETNAME main target name --outdir OUTDIR path to output dir --structs STRUCTS the structs json file --structsinclude STRUCTSINCLUDE the path to the header that's going to be included by structs.h before structure declarations. --xml XML paht to the xml file --dbg debug --datetime print date and time in autogen files --inline inlines reader funcs --static statics reader funcs --verbose verbose --forcenullterm terminate all strings with null even if they are not originally null-terminated --strbuffersize STRBUFFERSIZE the size of the buffer for string reads --strbuffgrowfactor STRBUFFGROWFACTOR the factor by which the strbuffer will grow --voidbuffersize VOIDBUFFERSIZE the size of the buffer for void* buffer --voidbuffgrowfactor VOIDBUFFGROWFACTOR the factor by which the voidbuffer will grow --singlefile the generated code will be put in a single file --singlefilename SINGLEFILENAME name of the single file --name will be used in generating some code identifiers ``` ## limitations Big-Endian reads are not supported.
None-byte-sized raw reads are not supported.
## makefile That would be on you but there is an example makefile in the `test` directory so you can take a look if you want.
To get a list of the targets the makfile supports you can run its `help` target.
## TODO All the items under limitations.
Figure out what the license of the generated code is.
## Projects The list of the projects that use faulreiber:
* [bruiser](https://github.com/bloodstalker/mutator/tree/master/bruiser)
## License `faultreiber` along with the makefile, are provided under MIT. I'm not sure whether the generated code is considered "derived work", but if it is, then the generated code will also fall under MIT