Course: Regular expression VHDL engine

Learn to build a regex-processing VHDL module with runtime-reconfigurable pattern matching and Unicode support, using AI-generated Python scripts to create configurations from regexes.

Description

This course teaches how to create a text-processing pipeline of VHDL modules that support UTF-8 (Unicode) and can search for regular expression (regex) patterns in the text stream.

You will learn how to create a reconfigurable finite-state machine (FSM) that allows us to upload a new regex pattern at runtime.

Furthermore, we’ll reduce the pipeline’s data width by creating a classifier VHDL module that reclassifies 21-bit Unicode symbols into custom VHDL types that require fewer bits.

Finally, we’ll make a top-level testbench that uploads the configuration bytes for a given regex pattern to the regex engine and classifier module before streaming a UTF-8 test file through the pipeline.

This course is only available in the VHDLwhiz Membership.

The membership subscription gives you access to this and many other courses and VHDL resources.

You pay monthly to access the membership and can cancel the automatic renewal anytime. There are no lock-in periods or hidden fees.

No FPGA board is required as this course is a pure simulation exercise.

Software used in the course

I use Windows 11 in the course. All the other software is available for free for Windows and Linux:

Course outline

Number of lessons: 19
Average video duration: 13m57s
Total video duration: 4h25m

The overview below shows the lessons in this course.

1 - Introduction

Welcome to the course! Let's talk about regex and what we'll create.

2 - Regex to NFA and DFA

We'll use Thompson's construction algorithm to convert the regular expression into a non-deterministic finite automaton (NFA) and then convert the NFA into a deterministic finite automaton (DFA) state machine.
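As a rough illustration of the NFA-to-DFA step, here is a minimal Python sketch. It assumes a hand-written, Thompson-style NFA for the pattern (ab)+ and applies the standard subset construction. The state numbering and the names NFA, eps_closure, subset_construction, and dfa_accepts are made up for this example and are not taken from the course material.

```python
from collections import deque

EPS = None  # label used for epsilon transitions in this sketch

# Hypothetical Thompson-style NFA for (ab)+:
# 0 -a-> 1 -b-> 2, then 2 -eps-> 0 (repeat) and 2 -eps-> 3 (accept)
NFA = {
    0: {"a": {1}},
    1: {"b": {2}},
    2: {EPS: {0, 3}},
    3: {},
}
NFA_START, NFA_ACCEPT = 0, 3


def eps_closure(states):
    """Return every NFA state reachable through epsilon transitions only."""
    closure, stack = set(states), list(states)
    while stack:
        for nxt in NFA[stack.pop()].get(EPS, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def subset_construction(alphabet):
    """Build a DFA whose states are sets of NFA states."""
    start = eps_closure({NFA_START})
    dfa, queue = {}, deque([start])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue  # this DFA state was already expanded
        dfa[current] = {}
        for symbol in alphabet:
            moved = set()
            for state in current:
                moved |= NFA[state].get(symbol, set())
            if moved:
                target = eps_closure(moved)
                dfa[current][symbol] = target
                queue.append(target)
    return start, dfa


def dfa_accepts(text, alphabet="ab"):
    """Return True if the whole text matches (ab)+ according to the DFA."""
    state, dfa = subset_construction(alphabet)
    for ch in text:
        state = dfa[state].get(ch)
        if state is None:
            return False  # no transition: dead state
    return NFA_ACCEPT in state
```

For this small NFA, the construction yields three DFA states, and dfa_accepts("abab") returns True while dfa_accepts("aba") returns False.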

3 - Regex with branching and merging NFA paths

Let's walk through a slightly different regex that requires the NFA to have parallel paths that converge in the end before the accepting state.

4 - Regex testbench and module entity

First, we'll create a regex VHDL engine implementation that only supports ASCII and a static search pattern. But let's start with the testbench.

5 - Static pattern FSM

To get started on the VHDL regex engine, we'll first implement a finite-state machine (FSM) supporting only a static pattern: (ab)+.
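To get a feel for what a static (ab)+ state machine does before writing any VHDL, a small software model can help. The Python sketch below is only an assumption about how such an FSM might behave: it consumes one character per step, like one symbol per clock cycle, and raises a match flag whenever an "ab" pair completes. The state names and the functions step and match_ends are hypothetical.

```python
IDLE, SEEN_A, SEEN_AB = range(3)


def step(state, ch):
    """Return (next_state, match) for one input character."""
    if ch == "a":
        # An 'a' can always start, or continue, a run of "ab" pairs.
        return SEEN_A, False
    if ch == "b" and state == SEEN_A:
        # A 'b' right after an 'a' completes a pair, so (ab)+ matches here.
        return SEEN_AB, True
    # Any other character breaks the pattern.
    return IDLE, False


def match_ends(text):
    """Return the index of every character that completes an (ab)+ match."""
    state, ends = IDLE, []
    for index, ch in enumerate(text):
        state, match = step(state, ch)
        if match:
            ends.append(index)
    return ends
```

With this model, match_ends("xababy") returns [2, 4]. Note that it only reports where a match ends; deciding where the longest match starts and stops is a separate problem.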

6 - FSM implementation challenges

Whether a substring is a full match depends on a variable number of characters before and after, and that’s tricky to handle in hardware.

7 - Class map outline

Our regex module shall support Unicode (UTF-8). But won't that make the data width very wide? Not necessarily.

8 - Class map config and testbench

The subset of Unicode characters this module can classify shall be runtime reconfigurable. Let's use a daisy-chained shift register for that.
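The idea of uploading configuration through a daisy-chained shift register can be modeled in a few lines of Python. This is a behavioral sketch under assumptions: the chain is byte-wide, each new byte enters at position 0, and every stored byte moves one position down the chain per shift. The class name ShiftRegisterChain and the upload helper are hypothetical, and the actual register width and ordering in the course may differ.

```python
class ShiftRegisterChain:
    """Behavioral model of a byte-wide, daisy-chained shift register."""

    def __init__(self, length):
        self.regs = [0] * length

    def shift_in(self, byte):
        # The new byte enters at position 0; the last byte falls off the end.
        self.regs = [byte & 0xFF] + self.regs[:-1]


def upload(chain, config_bytes):
    """Shift all configuration bytes into the chain, one per step."""
    for byte in config_bytes:
        chain.shift_in(byte)
    return chain.regs
```

After upload(ShiftRegisterChain(3), [1, 2, 3]), the registers hold [3, 2, 1]: the first byte sent ends up furthest down the chain, which matters when deciding the order of the configuration bytes.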

9 - Class map classifier logic

The classifier logic will use parallel comparators to check if the input Unicode character matches a codepoint we are looking for.
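A software model of the parallel-comparator idea might look like the Python sketch below. It assumes a table of configured code points where slot i maps to class i + 1, and class 0 means "any other character". That mapping, and the names classify and class_width_bits, are assumptions for illustration only. The sketch also shows why the class output needs far fewer bits than a 21-bit code point.

```python
def classify(codepoint, table):
    """Map a Unicode code point to a small class number."""
    # In hardware, all comparisons would run in parallel, one per table slot.
    hits = [codepoint == configured for configured in table]
    for index, hit in enumerate(hits):
        if hit:
            return index + 1  # classes 1..N are the configured code points
    return 0  # class 0: not a character the pattern cares about


def class_width_bits(table):
    """Number of bits needed to encode classes 0..len(table)."""
    return max(1, len(table).bit_length())
```

For example, with table = [ord("a"), ord("b"), ord("é")], classify(ord("é"), table) returns 3 and class_width_bits(table) returns 2, compared to the 21 bits of a full Unicode code point.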

10 - TXT to Graphviz using AI

We'll use AI (ChatGPT) to generate the Python scripts for us. Let's start with the graph visualizer tools.
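To give an idea of what such a visualizer script could do, here is a minimal Python sketch that turns a text description of a state machine into Graphviz DOT source. The input format (one "source symbol destination" triple per line, with # for comments) and the function name transitions_to_dot are assumptions made for this example; the TXT format used in the course is not described here.

```python
def transitions_to_dot(lines, accepting=()):
    """Convert 'src symbol dst' lines into Graphviz DOT source text."""
    out = ["digraph fsm {", "  rankdir=LR;"]
    for state in accepting:
        # Accepting states are conventionally drawn with a double circle.
        out.append(f'  "{state}" [shape=doublecircle];')
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        src, symbol, dst = line.split()
        out.append(f'  "{src}" -> "{dst}" [label="{symbol}"];')
    out.append("}")
    return "\n".join(out)
```

The returned string can be written to a .dot file and rendered with the Graphviz dot command-line tool.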

11 - NFA and DFA generator scripts

Using ChatGPT Codex in VSCode, we'll create two more Python scripts to derive the NFA and DFA graphs from regex patterns.

12 - Regex FSM states config

To make the regex FSM reconfigurable, we'll add a configuration port and an array to store the config data for each state in registers.

13 - Regex config mapping

Let's use nested generate statements to map the config shift register bytes to the record members in the states array.

14 - Regex config data simulation

It's time to update the testbench to upload the dynamic configuration bytes for a DFA graph to the regex engine VHDL module.

15 - Regex test data procedure

In this lesson, we'll convert the send_str procedure in the regex testbench to send class_t type symbols instead of ASCII characters.

16 - Reconfigurable FSM

Finally, we can complete the runtime reconfigurable implementation of the FSM process by using the configuration data.

17 - Top module

Now that we have all the modules needed to build a UTF-8 text-processing regex pipeline, we'll put them in a top-level VHDL file.

18 - Top testbench config data

Fortunately, we can reuse procedures from the regex and class_map testbenches to upload configuration data to the top module.

19 - Top testbench UTF-8 text test

Let's test the VHDL regex engine by streaming a UTF-8 (Unicode) text file through the pipeline.
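When streaming a UTF-8 file byte by byte, something has to reassemble the bytes into Unicode code points before they can be classified. The Python sketch below models that step as a small byte-wise state machine. It is a simplified illustration, not the course's implementation: it follows the standard UTF-8 bit layout but does not reject overlong encodings or surrogate code points. The function name decode_utf8 is hypothetical.

```python
def decode_utf8(data):
    """Decode UTF-8 bytes into a list of code points, one byte per step."""
    codepoints = []
    remaining = 0  # continuation bytes still expected
    value = 0      # code point being assembled
    for byte in data:
        if remaining == 0:
            if byte < 0x80:                # 0xxxxxxx: single-byte (ASCII)
                codepoints.append(byte)
            elif byte >> 5 == 0b110:       # 110xxxxx: 2-byte sequence
                value, remaining = byte & 0x1F, 1
            elif byte >> 4 == 0b1110:      # 1110xxxx: 3-byte sequence
                value, remaining = byte & 0x0F, 2
            elif byte >> 3 == 0b11110:     # 11110xxx: 4-byte sequence
                value, remaining = byte & 0x07, 3
            else:
                raise ValueError(f"invalid leading byte 0x{byte:02X}")
        else:
            if byte >> 6 != 0b10:          # continuation must be 10xxxxxx
                raise ValueError(f"invalid continuation byte 0x{byte:02X}")
            value = (value << 6) | (byte & 0x3F)
            remaining -= 1
            if remaining == 0:
                codepoints.append(value)
    if remaining:
        raise ValueError("truncated UTF-8 sequence")
    return codepoints
```

For valid input, the result agrees with Python's own decoder: decode_utf8("aé€".encode("utf-8")) gives the same values as [ord(c) for c in "aé€"].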
