Wednesday, April 19, 2023

FuseSoC 2.2




Do you know the best way to find out who is using your open source software? Introduce bugs! You will suddenly come in touch with a lot of users you didn't know existed. And let's just say I found out about a lot of new users after the release of FuseSoC 2.1. And with FuseSoC 2.1 having a lot of new features, it's perhaps not too surprising that the odd bug crept in.

But enough about that, because FuseSoC 2.2, the topic for today, has hopefully fixed what was broken. And of course we have a couple of new features as well, even though the list is somewhat shorter than usual. But let's see what the new version has to offer

JSON Schema

Generally, I'm pretty happy about the code quality of FuseSoC. It has proven to be relatively friendly to new contributors and has gone through a couple of major refactorings without too much problems over its almost 12 years of existence. But there is one part of the code base that I usually try to stay clear from.

Deep inside of the FuseSoC code base there is a yaml structure encoded inside a Python string that is parsed when the module is imported to dynamically create a tree of Python classes which are then used to recursively read and validate core description files. Pretty clever, right? This is a fantastic example of the sort of thing that seems great because it's possible and not too hard to actually do with Python. Now, the thing is, because of the cleverness of the code, it is pretty much unreadable even for me who wrote it. Every time I need to fix some bug in this area of the code I end up spending hours trying to figure out how it all works, all the time crying and asking why oh why I built it like this in the first place.

So what little time was saved on writing some more verbose code, we pay for over and over again in maintenance. Not to mention all the bizarre corner cases that arises because the code is trying to outsmart itself. Things that required us to create classes like this:

The time was ripe now to rework this whole thing into something more sensible. So what we do instead now is to have a JSON Schema definition of the CAPI2 format...encoded as a string in a Python module deep inside the FuseSoC code base. I understand this doesn't look all that much like an improvement, but it's the first, and most important, step of a journey.


Short-term this leads to a more maintainable parser and validator because we only need to care about the definition. There is battle-proven Python code already that does the actual validation and is better at pointing out where in a core description file there is an error. There are also other utilities for generating documentation to offload this from FuseSoC itself. The parsing is also a bit more consistent now and supports use flag expansion in more places.

But long-term, this paves the road for actually splitting out the CAPI2 definition to its own project that can be readily reused by other tools without having to use FuseSoC. Having the validation code in jsonschema allows for much easier reimplementations and utilities written in other languages than Python. The first case that comes to mind is JavaScript for having web-based utilities around CAPI2 or built-in validation of core description files in e.g. VS code. But it also makes it easier to implement support for CAPI2 files directly in EDA tools written in Java, C++ or why not Rust.

It should be noted that the new parser is a bit more strict than the old one, so it might complain on files that were previously deemed ok. Hopefully there shouldn't be too many of those. There's also a new command-line switch --allow-additional-properties that can be turned on to make the parser more relaxed towards elements in the core description files that it doesn't know about.

Tags

The other thing I want to mention in this release is a small code change that I think will have open up for more use cases. It's now possible to set tags for files or filesets, very much like we can set file_type or logical_name today. FuseSoC itself doesn't care about the tags, but they are passed on to Edalize through the EDAM file. In Edalize, since version 0.5.0 we have begun to look at tags in some of the flows and take decisions upon them. The only tag that is recognized today is the "simulation" tag, that can be set on HDL files to indicate they are intended for simulation and not for synthesis. This change opens up for use-cases such as gate-level simulation where we first send our code through a synthesis tool and then the created netlist is simulated together with a testbench. By marking the testbench files with simulation, we tell the synthesis tools to not try to synthesize them into the netlist but instead pass them on to the simulator untouched. Another future use-case is for TCL files. There might be many tools in a tool flow that parses TCL files and so far, there hasn't been a way to tell the backend for which tool a particular TCL file is intended. I suspect we will see a whole bunch of more use-cases in the future.

Other things

I mentioned some bugs, right? A big one was that users of the old tool API (which I believe is still most users) noticed that the FPGA image or simulation model was not rebuilt when source files were changed. The new flow API has some properties that allows us to track changes in a much better way and avoids unnecessary rebuilds in many cases. Unfortunately, when these changes were made we didn't properly test how that affected the tool API.

Another issue was reported from users who uses --no-export together with generators. The recently introduced caching mechanism forced us to rewrite much of the code around generators and unfortunately we ended up missing this case, where the generated code got removed before it was used. Whoops. Also fixed now.

All in all, I hope you enjoy the new features and the new release. Happy FuseSoCing!

No comments:

Post a Comment