# faultreiber
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber?ref=badge_shield)
`faultreiber` generates a parser library in C for a structured (binary) file format. The input is an XML file that describes the format.
The C source code will be in the form of multiple source and header files.
The headers have header guards and are already `extern "C"`ed.
## Demo
For a practical example, look at the example XML file under `resources`. The XML file describes the format of a WASM object file.
To run the demo, run `run.sh`, go to the `test` directory and run `make test` to run the executable.
To run `valgrind --leak-check=yes`, run `make valgrind`.
## Memory Leaks
Code generated by faultreiber should not leak any memory if everything went according to plan during code-gen. If that's not the case, let me know.
## How to Use
A function named `read_aggr_{name}` will be generated that takes an `int _fd` file descriptor for the file that it will read. `{name}` is what you pass to faultreiber with the `--name` option.
The return type will be a C structure with type `{name}_lib_ret_t`. The struct is defined as:
```C
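typedef struct {
  name_obj_t obj;               /* holds all the read modules */
  void** void_train;            /* passed to release_all_{name} */
  uint64_t* current_void_size;
  uint64_t* current_void_count; /* passed to release_all_{name} */
} name_lib_ret_t;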
```
`{name}_obj_t` is a C structure defined in `aggregate.h` that holds all the read modules.
A function named `release_all_{name}` will be generated in `aggregate.c` that releases almost all the memory.
The proper order of releasing the memory in the client code is shown below, assuming the return value of `read_aggr_{name}` is stored in `lib_ret` and `--name` was passed a value of `wasm`:
```C
release_all_wasm(lib_ret->void_train, lib_ret->current_void_count);
free(lib_ret->obj);
free(lib_ret);
```
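To make the idea behind `void_train` a bit more concrete, here is a minimal sketch of a growable `void*` list that records allocations and frees them in one pass. This is only a plausible model of the bookkeeping, not the code faultreiber generates: `void_train_t`, `train_push` and `train_release_all` are names made up for this example, and the doubling growth policy is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tracker. The field name echoes the struct above, but the
 * layout is assumed for illustration only. */
typedef struct {
  void **void_train; /* every pointer handed out so far */
  uint64_t size;     /* number of slots currently allocated */
  uint64_t count;    /* number of slots in use */
} void_train_t;

/* Record one allocation, growing the list when it is full.
 * Returns 0 on success and -1 if the list could not grow. */
int train_push(void_train_t *t, void *p) {
  if (t->count == t->size) {
    uint64_t new_size = t->size ? t->size * 2 : 4; /* assumed growth factor */
    void **grown = realloc(t->void_train, new_size * sizeof(void *));
    if (grown == NULL) return -1;
    t->void_train = grown;
    t->size = new_size;
  }
  t->void_train[t->count++] = p;
  return 0;
}

/* Free every recorded pointer, then the list itself.
 * Returns the number of pointers that were released. */
uint64_t train_release_all(void_train_t *t) {
  uint64_t released = t->count;
  for (uint64_t i = 0; i < t->count; ++i) free(t->void_train[i]);
  free(t->void_train);
  t->void_train = NULL;
  t->size = 0;
  t->count = 0;
  return released;
}
```

In this model, anything that was never pushed onto the list has to be freed separately, which is why the client code above still calls `free` after the release function.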
## faultreiber XML file
The root node should have two children, named exactly `READ` and `DEFINITION` (order not important).
The `READ` node will include the actual structures that the parser will read and can return.
The `DEFINITION` node includes the definitions for the structures that are aggregate.
## Rules:
Any child node of either `DEFINITION` or `READ` will have to at least have the attributes `name` and `type` defined. The presence of the attribute `count` is optional but if it's not present faultreiber will assume that the count is one.
The presence of the attribute `isaggregate` signifies that the data structure is composed of other smaller parts. faultreiber will only read the children of a node that is the child of either the `DEFINITION` or `READ` node (unless a child node has the attribute `conditional` set). If a data structure requires more children, add a new node under `DEFINITION` and reference that node from its parent. In other words, an aggregate node can't itself have child nodes that are aggregate.
`count`, `size`, `type` and `condition` attributes can reference a child node of the `DEFINITION` node. To do that, you should use `self::TAG`.
The tag names of the nodes that are on the same level should be unique. The `name` attributes of the nodes on the same level need to be unique as well.
The order of the nodes that appear as children of the `DEFINITION` node, even when the child nodes are referencing each other, is unimportant to faultreiber.
Tags should follow the naming convention for naming XML nodes. The `name` attributes should follow the C identifier naming convention (if the value of the `name` attribute is invalid in C as an identifier, you're going to end up with code that won't even build).
The following values are valid values for the `type` attribute:
* int8
* uint8
* int16
* uint16
* int32
* uint32
* int64
* uint64
* int128
* uint128
* float
* double
* string
* FT::conditional
* self::TAG
A string node should have either a non-empty `size` attribute or a `delimiter` attribute. When `delimiter` is used, the value of the attribute is the delimiter itself.
Strings read through a `delimiter` node will have their delimiter attached to the end of the string (null-terminated or otherwise). String reads that have a `size` attribute will be forcefully null-terminated even if the original string was not null-terminated.
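The two string behaviours described above can be sketched in plain C. This is an illustration of the semantics only, not the code faultreiber emits: `read_sized_string` and `read_delimited_string` are hypothetical helpers, and the initial buffer size and doubling growth are assumptions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `size` bytes from `fd` and append a
 * terminating '\0', whether or not the data on disk was null-terminated.
 * Returns NULL on a short read or allocation failure. */
char *read_sized_string(int fd, size_t size) {
  char *buf = malloc(size + 1);
  if (buf == NULL) return NULL;
  size_t done = 0;
  while (done < size) {
    ssize_t n = read(fd, buf + done, size - done);
    if (n <= 0) { free(buf); return NULL; }
    done += (size_t)n;
  }
  buf[size] = '\0'; /* forced null termination */
  return buf;
}

/* Hypothetical helper: read bytes until `delimiter` is seen. The delimiter
 * is kept as the last byte of the result and nothing is appended after it,
 * so the length is reported through `out_len`.
 * Returns NULL if the input ends before the delimiter or on allocation
 * failure. */
char *read_delimited_string(int fd, char delimiter, size_t *out_len) {
  size_t cap = 16, len = 0; /* assumed starting capacity */
  char *buf = malloc(cap);
  if (buf == NULL) return NULL;
  for (;;) {
    char c;
    if (read(fd, &c, 1) != 1) { free(buf); return NULL; }
    if (len == cap) {
      char *grown = realloc(buf, cap * 2);
      if (grown == NULL) { free(buf); return NULL; }
      buf = grown;
      cap *= 2;
    }
    buf[len++] = c;
    if (c == delimiter) break;
  }
  *out_len = len;
  return buf;
}
```

Note that a delimited string is only safe to treat as a C string when the delimiter is `'\0'`; for any other delimiter the caller should rely on the reported length.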
Child nodes of the `READ` node that have the `unordered` attribute set will be regarded as such, meaning they can appear in the file sporadically. Such nodes must have a child node with the attribute `sign`. The value of the `sign` attribute is used to check for the presence of the parent node in the file.
`unorderedbegin` and `unorderedend` attributes denote the beginning and end of an unordered section in the `READ` node. For every unordered section, only one node needs to define the begin and end attributes. All the other nodes, including the nodes that define the `unorderedbegin` and `unorderedend` attributes, shall have the `unordered` attribute defined.
Any child of the `READ` node that is not inside an unordered block or doesn't have the `unordered` attribute set will be regarded as ordered.
Whether `int128` or `uint128` are defined depends on the C implementation you are using on your host. If 128-bit integers are not supported or you need to read in bigger integers, you can simply use a smaller int type and increase the `count` attribute accordingly.
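As a rough illustration of the `count` workaround: a 128-bit field could be described as a `uint64` with `count` set to 2, which leaves the client code with two 64-bit halves. The sketch below assumes little-endian data (the only byte order faultreiber reads). `u128_parts_t`, `load_le64` and `split_u128_le` are names made up for this example and are not part of the generated code.

```c
#include <stdint.h>

/* Hypothetical container for a 128-bit value held as two 64-bit halves. */
typedef struct {
  uint64_t lo; /* bytes 0..7 of the field */
  uint64_t hi; /* bytes 8..15 of the field */
} u128_parts_t;

/* Decode 8 little-endian bytes into a uint64_t, independent of the
 * byte order of the host. */
static uint64_t load_le64(const unsigned char *p) {
  uint64_t v = 0;
  for (int i = 7; i >= 0; --i) v = (v << 8) | p[i];
  return v;
}

/* Split a 16-byte little-endian field into its low and high halves. */
u128_parts_t split_u128_le(const unsigned char bytes[16]) {
  u128_parts_t out;
  out.lo = load_le64(bytes);
  out.hi = load_le64(bytes + 8);
  return out;
}
```

With little-endian data the first element read is the low half, so the full value is `hi * 2^64 + lo`.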
The `FT::conditional` tag for a type means that the actual content of the node will depend on a value. The attribute `condition` will provide what that condition is. The value for the condition should be provided as text for the different nodes that define what the actual contents should be.
The `size` attribute is currently only meaningful when the `type` attribute is set to `string`, in which case it denotes the size of the string.
## Options
```bash
  -h, --help            show this help message and exit
  --targetname TARGETNAME
                        main target name
  --outdir OUTDIR       path to output dir
  --structs STRUCTS     the structs json file
  --structsinclude STRUCTSINCLUDE
                        the path to the header that's going to be included by
                        structs.h before structure declarations.
  --xml XML             path to the xml file
  --dbg                 debug
  --datetime            print date and time in autogen files
  --inline              inlines reader funcs
  --static              statics reader funcs
  --verbose             verbose
  --forcenullterm       terminate all strings with null even if they are not
                        originally null-terminated
  --strbuffersize STRBUFFERSIZE
                        the size of the buffer for string reads
  --strbuffgrowfactor STRBUFFGROWFACTOR
                        the factor by which the strbuffer will grow
  --voidbuffersize VOIDBUFFERSIZE
                        the size of the buffer for void* buffer
  --voidbuffgrowfactor VOIDBUFFGROWFACTOR
                        the factor by which the voidbuffer will grow
  --singlefile          the generated code will be put in a single file
  --singlefilename SINGLEFILENAME
                        name of the single file
  --name                will be used in generating some code identifiers
```
## limitations
Big-endian reads are not supported.
Non-byte-sized raw reads are not supported.
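Until big-endian reads are supported, one possible workaround in client code is to reverse the bytes of a field after it has been read. The sketch below shows that for a 32-bit value. It is an assumption about how you might post-process the parsed data, not something faultreiber does for you, and `swap_u32` is a made-up helper name.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. Applying this to a field that
 * was stored big-endian but read as little-endian yields the intended
 * number. */
uint32_t swap_u32(uint32_t v) {
  return ((v & 0x000000FFu) << 24) |
         ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) |
         ((v & 0xFF000000u) >> 24);
}
```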
## makefile
That would be on you but there is an example makefile in the `test` directory so you can take a look if you want.
You can also get generic ones from [here](https://github.com/bloodstalker/lazymakefiles). They're licensed under the Unlicense.
## TODO
All the items under limitations.
Figure out what the license of the generated code is.
## Projects
The list of the projects that use faultreiber:
* [bruiser](https://github.com/bloodstalker/mutator/tree/master/bruiser)
## License
`faultreiber` is provided under MIT. I'm assuming (I'm not a lawyer) the generated code is considered "derived work". If it is, then the generated code will also fall under MIT.
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fbloodstalker%2Ffaultreiber?ref=badge_large)