Writing a digital logic simulator - Part 6

A modular redesign.

Introduction

Links to previous posts: Part 1 Part 2 Part 3 Part 4 Part 5

In the last post we added a parser to the simulator, so maybe it’s time to structure the code.

Modules

The files of the simulator are:

build.rs
Cargo.toml
src/
    main.rs
    emit_json.rs
    parser.rs
    comphdl1.lalrpop

We will remove most logic from the main.rs file, and create the following modules:

src/
    main.rs
    emit_json.rs
    parser.rs
    bit.rs
    component.rs
    simulation.rs

Since I always forget how to do modules in Rust, you need to declare them in the main.rs or lib.rs file:

// main.rs
mod bit;

And then you can either create a bit.rs file in the same directory, or create a bit/ directory and add a bit/mod.rs file.

Let’s see the result.

The main function parses the command-line arguments:

fn main(){
    // Usage: cargo run (for default arguments)
    //        cargo run -- test.txt Buf123 (filename, component name)
    use std::env;
    let mut args = env::args();
    let _program_name = args.next().unwrap();
    let filename = args.next().unwrap_or(format!("test.txt"));
    let top = args.next().unwrap_or(format!("Demux_1_4"));
    parse_file(&filename, &top);
}
fn parse_file(filename: &str, top: &str) {
    // Create gate
    let mut gate = parser::parse_file(filename, top);

    // Run simulation
    let mut buf = Vec::with_capacity(20_000_000);
    let mut input = RepInputIterator::new(10, 50);
    run_simulation(&mut buf, &mut *gate, &mut input, 4000).unwrap();

    // Write simulation to foo.vcd
    let mut file = File::create("foo.vcd").expect("Unable to create file");
    file.write_all(&buf).expect("Error writing vcd");

    // Print netlist JSON
    yosys_netlist(&*gate);
}

This function should probably be named parse_file_and_run_simulation instead, because there is another parse_file function in the parser module:

pub fn parse_file(filename: &str, top: &str) -> Box<Component> {
    let file = File::open(filename).expect("Unable to open file");
    let mut buf_reader = BufReader::new(file);
    let mut bs = String::new();
    buf_reader.read_to_string(&mut bs).unwrap();

    let c = comphdl1::FileParser::new().parse(&bs).unwrap();
    let s = ComponentFactory::new(c);
    let mux = s.create_named(top);
    println!("{:#?}", mux);

    mux
}

This is the real parse_file, it calls the FileParser that is generated by LALRPOP, and returns a (CompInfo, Vec<CompInfo>).

ComponentFactory

pub struct ComponentFactory {
    comp_id: HashMap<String, CompId>,
    components: HashMap<CompId, CompInfo>,
    comp_def: HashMap<CompId, CompDefinition>,
}

This is the ComponentFactory. It basically stores all the CompInfo: name, inputs, outputs (the output of the parser) and also the CompDefinition, which stores the connections needed to create the component. This information is accessed using the CompId, which is a unique index for each component name: for example “Nand” -> 0, “ConstantBit” -> 1.

pub struct CompId(usize);

pub struct CompInfo {
    name: String,
    inputs: Vec<String>,
    outputs: Vec<String>,
}

pub struct CompDefinition {
    comp: Vec<CompId>, // global component id, including c_zero
    connections: HashMap<ComponentIndex, Vec<ComponentIndex>>, // connections[local_comp_id][output_id]
    generics: HashMap<usize, (usize, usize)>,
}

CompDefinition stores everything that is needed to create a component: the CompId of the other components needed, and the connections. There is also a “generics” field, which is used for the Nand gate, since it can have any number of inputs but we need to know that number when creating it. So if the Nand with 4 inputs is the comp[2], then generics[2] will contain (4, 1).

The steps to create the ComponentFactory are basically:

  • Create the empty hashmaps
  • Insert builtin components (Nand, ConstantBit)
  • Insert other components (from the parser)
  • Create CompDefinitions for the other components

The CompDefinition::new() function connects the components together. That is accomplished by using another hashmap, which maps the signal name to the ComponentIndex (the component and port where it is connected). So if we do something like:

component Buf2(a) -> a1 {
    Buf(a) -> a0;
    Buf(a0) -> a1;
}

Then the signals hashmap will contain two indexes in the entry for “a0”: it’s the output of Buf(a), and also the input of Buf(a0). Then we just need to verify that there is only one output connected to a0 (in this case Buf(a)), because that’s our way to avoid short-circuits, and finally add a connection from Buf(a).output0 to Buf(a0).input0.

So now we have a ComponentFactory, how do we create the Components?

// impl ComponentFactory
fn create(&self, c_id: CompId) -> Box<Component> {
    let ref inputs = self.components[&c_id].inputs;
    let ref outputs = self.components[&c_id].outputs;
    let ref name = self.components[&c_id].name;

    println!("Creating component with id {}: {}", c_id.0, name);
    let ref def = self.comp_def[&c_id];

    let c_zero = CompIo::c_zero(inputs.len(), outputs.len());
    let mut c = vec![c_zero];
    
    for (local_id, &new_id) in def.comp.iter().enumerate().skip(1) {
        let (num_i, num_o) = def.generics[&local_id];
        let boxed_gate = if let Some(c) = self.create_builtin(new_id, num_i, num_o) {
            println!("DEBUG: Created builtin gate {}", self.components[&new_id].name);
            c
        } else {
            self.create(new_id)
        };
        let mut x = CompIo::new(boxed_gate);
        c.push(x);
    }

    for (from, to) in &def.connections {
        let ref mut x = c[from.c_id];
        assert!(from.is_output());
        for ref to in to {
            assert!(!to.is_output());
            x.add_connection(from.port_id, Index::new(to.c_id, to.port_id));
        }
    }

    let pn = PortNames::new_vec(inputs.clone(), outputs.clone());
    let gate = Structural::new(c, inputs.len(), outputs.len(), &self.components[&c_id].name, pn);

    Box::new(gate)
}

The process is similar to what we used to do manually: create the components and add connections, but now we have everything stored so the process is simple. We use recursion to create the inner components, except when the inner component is a builtin (like a Nand). Then we add the connections, the port names, and finally create the Structural and return the boxed component. We could probably return a Structural instead of a Box<Component>, since the other components like the Nand can be trivially converted into a Structural:

component Nand4(a, b, c, d) -> x {
    Nand(a, b, c, d) -> x;
}

Oh yeah, that’s a problem, since the emit_json.rs code only works on Structural, because Components don’t necessarily have any connections. As a reminder, Component is a trait and Structural implements that trait, Nand and ConstantBit also implement it. We can’t downcast the Component to Structural, because it may be a Nand instead, so a nice hack is to add this method to the Component trait:

// trait Component
fn clone_as_structural(&self) -> Option<Structural> {
    None
}

Leave that default implementation for Nand and other components, but implement it for Structural:

// impl Component for Structural
fn clone_as_structural(&self) -> Option<Structural> {
    Some(self.clone())
}

But to clone a Structural we need to be able to clone Components, since a Structural owns many Box<Component>. There is another hack to allow this:

trait Component {
    fn box_clone(&self) -> Box<Component>;
}

impl Clone for Box<Component> {
    fn clone(&self) -> Box<Component> {
        self.box_clone()
    }
}

impl Component for Nand {
    fn box_clone(&self) -> Box<Component> {
        Box::new((*self).clone())
    }
}

We manually implement Clone for a boxed component, which relies on the box_clone method, which can be easily implemented just by copy-pasting that one line of code.

For future reference, if you ever find yourself doing stuff like that, then your code is horribly wrong. In my case I will try to fix that some day.

Avoiding duplication

Now that we introduced the ComponentFactory, that should be a main feature, so let’s rebuild the simulator arround it. Let’s begin with the components. Right now every instance of every component stores its name and its port names, so if we create 100 Demux_4_1 its name and port names will be stored 100 times in memory.

That isn’t necesary, we could store that data in a common place. Such a place already exists, it’s the components field in the ComponentFactory. Actually, we don’t even need to store the connections since they also are common. We can just store a reference to the definition of this component. That would leave the Structural as just:

struct Structural {
    signals: Vec<CompIo>, // components and state
    info: Rc<CompInfo>, // name and port names
    connections: Rc<CompDefinition>, // components and connections
}
struct CompIo {
    comp: Box<Component>,
    input: Vec<Bit>,
    output: Vec<Bit>,
}

We can use reference counting pointers Rc to store references in a safe way. Another option would be to store the CompId and a reference to the ComponentFactory, and get info and connections from there (but how do we get a Rc<ComponentFactory> in the create() method?). Either way I think that we should separate code from data. So let’s do the first option, we need Rc<CompInfo> and Rc<CompDefinition> so we must change the ComponentFactory definition to:

pub struct ComponentFactory {
    comp_id: HashMap<String, CompId>,
    components: HashMap<CompId, Rc<CompInfo>>,
    comp_def: HashMap<CompId, Rc<CompDefinition>>,
}

Luckly we can transform a HashMap<K, V> into a HashMap<K, Rc<V>> using the into_iter/map/collect combo:

let components = components.into_iter().map(|(k, v)| (k, Rc::new(v))).collect();
let comp_def = comp_def.into_iter().map(|(k, v)| (k, Rc::new(v))).collect();

And magically the code compiles.

To change the Structural we need to do many things, but there is a small problem with the connections. Rc<CompDefinition> stores the connections as a HashMap, and we cannot iterate over all the connections of a component when using a hashmap. Well, we can but it won’t be as efficient as a vector. So for now let’s just store the connections as a Vec instead of using a Rc. This will allow a small optimization:

fn propagate(&mut self, c_id: usize) {
    // TODO: avoid this clone
    let connections = self.components[c_id].connections.clone();
    for (out_id, to) in connections.iter().enumerate() {
        for i in to {
            self.components[i.comp_id]
                .input[i.input_id] = self.components[c_id].output[out_id];
        }
    }
}

Here we want to modify self.components to propagate the signal. Since the connections are also stored in self.components, Rust doesn’t allow us to iterate over something that may be destroyed. So we are forced to clone the connections, which is relatively expensive. But if we store the connections in Structural instead, this problem disappears:

    let connections = &self.connections[c_id];

This is important, because if you think about it, CompIo is a component’s interface, so it must know the connections, but the connections are represented as indexes which can only be used by the Structural, so it makes more sense to store them outside the CompIo.

Dirty components

While we are at it, let’s add another simple optimization. Currently, we update everything every time. But most of the time, if one component’s input doesn’t change, the output will also not change. So let’s add a new method to the Component trait:

// trait Component
// Does this component need an update even if the inputs haven't changed?
fn needs_update(&self) -> bool {
    true
}

This allows components to avoid being updated when it isn’t necessary. A component will always be updated when its inputs change. But when they don’t change, we ask the component if it needs an update. If it doesn’t, it basically goes to sleep until some input signal changes.

The default is that a component will always be updated, and we must opt-in to this optimization.

// Nand
fn needs_update(&self) -> bool {
    false // The output depends only on the inputs
}
// ConstantBit
fn needs_update(&self) -> bool {
    // This component has no inputs, so it should only be updated once
    false
}
// Structural
fn needs_update(&self) -> bool {
    // a structural needs an update if any of its components does
    self.component_dirty.iter().skip(1).any(|&x| x == true)
}

Our builtin components, the Nand and the ConstantBit don’t have any state, so they only need to be updated when their inputs change (in the case of the ConstantBit which doesn’t have any inputs, they never change so it will never be updated, which is what we want).

The Structural is a bit more complicated because we need to keep track of which internal signals change, because that will trigger an update. We store the internal components which need an update as component_dirty: Vec<bool>. I’m not sure if “dirty” is the right term, but that’s what is used in graphics: when you resize a window it must be redrawn so it is marked as dirty. A quick google search shows that this may be a correct term, as it’s used to refer to dirty bits . So how do we mark components as dirty? Easy, just change a bit the update_components function:

fn update_components(&mut self) {
    for c in 1..self.components.len() {
        // Iterate over dirty components only
        if self.component_dirty[c] == false {
            continue;
        }
        self.components[c].update();
        self.component_dirty[c] = self.components[c].comp.needs_update();
    }
}

If a component is dirty, update it and ask if it need to be updated again. But we also need to force an update when the inputs change, so now we need to modify the propagate function again:

fn propagate(&mut self, c_id: usize) {
    let connections = &self.connections[c_id];
    for (out_id, to) in connections.iter().enumerate() {
        for i in to {
            if self.components[i.comp_id]
                .input[i.input_id] != self.components[c_id].output[out_id] {

                self.components[i.comp_id]
                    .input[i.input_id] = self.components[c_id].output[out_id];
                self.component_dirty[i.comp_id] = true;
            }
        }
    }
}

It may be a bit hard to see, but it’s basically:

// Before:
a = b;

// Now
if a != b {
    a = b;
    dirty = true;
}

There is also a small addition to CompIo, now it stores if the output has changed in the last update. That’s very useful because if the output doesn’t change we can skip propagating the signals for this component. So we add a output_changed: bool flag, and create the update function:

pub fn update(&mut self) {
    let new_output = self.comp.update(&self.input);
    if new_output == self.output {
        self.output_changed = false;
    } else {
        self.output = new_output;
        self.output_changed = true;
    }
}

Nice optimizations, let’s do some benchmarks. We are only interested in the simulation time, but let’s just make it very long, that should be enough. I run a simulation for 4 million ticks, which would generate about 1 GB of data in vcd format so it’s important to not save it. The component will be the demultiplexer from last post.

And thanks to the magic of version control we can go back in time and run the same “benchmark” in the old version:

git checkout blog-05
cargo run --release
time !!
git checkout master

Here is a table comparing the results:

T 1 5 50 500
Old 0m55,264s 0m55,240s 0m54,302s 0m53,755s
New 0m45,917s 0m43,273s 0m37,377s 0m39,264s

I changed one parameter to better see the real improvement: T. The input remains constant during “T” ticks, and obviously when the input remains constant for a longer time, the component updates will be less frequent, so the dirty component optimization kicks in. Note that even when the input changes every tick (T=1), there are still bits which don’t change:

0000
0001
0010
0011

The 2nd input from the right changes every 2T, the 3rd every 4T, etc, so the optimization also helps in that case.

Also note that the time is that of 4 million ticks, so 1 tick is about 0.011 ms for this simple component. If we ever want to run an interactive simulation at a standard 60 fps (16.6 ms per tick), we could simulate more than 1000 components! At least in theory.

Conclusion

So yeah, this project is starting to grow, but there’s still many things to do.

The source code is available here:

https://github.com/Badel2/comphdl/tree/blog-06

Part 7

Written on May 30, 2018