Projects

Research and engineering projects in chip design, computer architecture, and hardware engineering. Most are open source and under active development.

RISC-V Out-of-Order Core

active

A parameterized out-of-order RISC-V processor core with 6-wide issue, TAGE branch prediction, and a non-blocking L1 data cache.

  • 6-wide superscalar out-of-order execution
  • TAGE branch predictor with 95%+ accuracy
  • Non-blocking L1 data cache with MSHRs
  • Verified with riscv-dv random instruction generator
  • Synthesized at 1.2 GHz on a 28 nm process
SystemVerilog
RISC-V
VCS
Verdi
Python

NPU Performance Simulator

active

Cycle-accurate simulator for systolic-array-based NPU architectures, supporting multiple dataflows and memory hierarchies.

  • Configurable array size (16x16 to 128x128)
  • Weight-stationary and output-stationary dataflows
  • Multi-level memory hierarchy modeling
  • DRAM bandwidth bottleneck analysis
  • Integrated with PyTorch for workload traces
C++
Python
SystemC
NumPy
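
The DRAM bandwidth bottleneck analysis can be illustrated with a first-order estimate. The sketch below is not taken from the simulator; it assumes perfect array utilization and that each operand crosses the DRAM bus exactly once, and the function name and parameters are hypothetical.

```python
def gemm_bottleneck(m, k, n, array_dim, bytes_per_elem, dram_bytes_per_cycle):
    """Estimate whether an (m x k) by (k x n) matmul is compute- or DRAM-bound."""
    # Ideal compute cycles: one MAC per PE per cycle, perfect utilization assumed.
    macs = m * k * n
    compute_cycles = macs / (array_dim * array_dim)
    # Minimum DRAM traffic: both inputs and the output cross the bus once (assumption).
    traffic_bytes = (m * k + k * n + m * n) * bytes_per_elem
    dram_cycles = traffic_bytes / dram_bytes_per_cycle
    bound = "dram" if dram_cycles > compute_cycles else "compute"
    return compute_cycles, dram_cycles, bound
```

A large square matmul on a small array tends to come out compute-bound under this model, while a matrix-vector product on a large array tends to come out DRAM-bound. A cycle-accurate simulator would refine both numbers with tiling, buffering, and dataflow effects.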

FPGA-Based Radar Signal Processor

completed

Real-time FMCW radar signal processing pipeline on Xilinx Zynq, including FFT, CFAR detection, and angle estimation.

  • 1024-point FFT pipeline with 4-cycle throughput
  • 2D CFAR detector with configurable guard cells
  • MUSIC algorithm for angle-of-arrival estimation
  • AXI-Stream interface for data movement
  • Real-time processing at 100 MSPS
Verilog
Vivado
Zynq
MATLAB
C
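
To show what the guard cells in a CFAR detector do, here is a minimal software sketch of one-dimensional cell-averaging CFAR. It is a simplification of the 2D detector listed above, not the FPGA implementation; the function name, parameters, and threshold scale are illustrative assumptions.

```python
def ca_cfar(power, num_train, num_guard, scale):
    """Return indices of cells whose power exceeds scale * mean of the training cells."""
    detections = []
    half = num_train + num_guard
    for i in range(half, len(power) - half):
        # Training cells sit on both sides of the cell under test, beyond the guard cells.
        leading = power[i - half : i - num_guard]
        trailing = power[i + num_guard + 1 : i + half + 1]
        noise = (sum(leading) + sum(trailing)) / (2 * num_train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections
```

Guard cells keep energy that leaks from a target into neighboring bins out of the noise estimate, which would otherwise raise the threshold and mask the target itself.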

Cache Coherence Verification Framework

completed

Formal verification framework for cache coherence protocols using SystemVerilog Assertions and JasperGold.

  • Support for MSI, MESI, and MOESI protocols
  • Automated litmus test generation
  • Coverage-driven verification methodology
  • Integration with TileLink/ACE interfaces
  • Reported and fixed 3 protocol-level bugs
SystemVerilog
SVA
JasperGold
Python
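
The kind of property such a framework proves can be illustrated in software. The sketch below exhaustively explores a deliberately simplified MESI model (one cache line, reads and writes only, no evictions, no bus timing) and checks the single-writer invariant. It is an assumption-laden toy, not the SVA/JasperGold setup; all names are hypothetical.

```python
from itertools import product

def step(states, actor, op):
    """Apply one read or write by cache `actor` under a simplified MESI model."""
    new = list(states)
    others = [i for i in range(len(states)) if i != actor]
    if op == "read":
        if states[actor] == "I":
            shared = any(states[i] != "I" for i in others)
            # Other holders downgrade to S (dirty data is assumed written back).
            for i in others:
                if new[i] in ("M", "E"):
                    new[i] = "S"
            new[actor] = "S" if shared else "E"
    else:  # write
        for i in others:
            new[i] = "I"  # invalidate every other copy
        new[actor] = "M"
    return tuple(new)

def coherent(states):
    """Single-writer invariant: an M or E copy excludes any other valid copy."""
    exclusive = sum(s in ("M", "E") for s in states)
    valid = sum(s != "I" for s in states)
    return exclusive == 0 or valid == 1

def check(num_caches=2):
    """Explore every reachable state and return those that break the invariant."""
    start = ("I",) * num_caches
    seen, frontier, bad = {start}, [start], []
    while frontier:
        cur = frontier.pop()
        if not coherent(cur):
            bad.append(cur)
        for actor, op in product(range(num_caches), ("read", "write")):
            nxt = step(cur, actor, op)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return bad
```

An empty list from `check()` means no reachable state violates the invariant in this toy model. A formal tool performs the same kind of exhaustive reasoning on the actual RTL and interface signals.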

LLVM Backend for Custom AI Accelerator

active

Custom LLVM backend for a proprietary AI accelerator ISA, including instruction selection, scheduling, and code generation.

  • Custom ISA with vector and matrix extensions
  • TableGen-based instruction definitions
  • MLIR dialect for high-level operations
  • Loop tiling and fusion optimizations
  • Achieved 78% of peak theoretical throughput
C++
LLVM
MLIR
Python
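
Loop tiling is easiest to see on a matrix multiply. The following is a plain-Python sketch of the transformation only; it assumes square tiles and list-of-lists matrices, and it does not reflect how the backend or the MLIR dialect actually expresses the optimization.

```python
def matmul_tiled(a, b, tile):
    """Multiply matrices (lists of lists) using square loop tiles."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                # The inner loops touch only one tile of each operand at a time.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

On an accelerator, the tile size would typically be chosen so that one tile of each operand fits in on-chip memory, which reduces off-chip traffic without changing the result.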

Open-Source AMBA Testbench

archived

Comprehensive UVM-based testbench for AMBA AXI4/AHB protocols with scoreboarding and coverage collection.

  • AXI4, AXI4-Lite, and AHB-Lite VIP components
  • Constrained-random transaction generation
  • Functional coverage model with 100% coverage target
  • Reusable agent architecture
  • Open-source under MIT license
SystemVerilog
UVM
AXI
AHB
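
Constrained-random generation can be sketched outside UVM. The Python example below draws random AXI4 INCR bursts subject to two protocol constraints: a burst has between 1 and 256 beats, and it must not cross a 4 KB address boundary. The function name and dictionary fields are hypothetical and are not the testbench's sequence items; `len` here counts beats, whereas the AxLEN signal encodes beats minus one.

```python
import random

def random_incr_burst(rng, size_bytes=4, max_len=256):
    """Draw a random aligned AXI4 INCR burst that stays within one 4 KB region."""
    # Aligned start address; size_bytes is assumed to be a power of two up to 128.
    addr = rng.randrange(0, 1 << 32, size_bytes)
    # Number of beats that still fit before the next 4 KB boundary.
    room = (4096 - addr % 4096) // size_bytes
    length = rng.randint(1, min(max_len, room))
    return {"addr": addr, "len": length, "size": size_bytes}
```

Usage: `random_incr_burst(random.Random(0))` returns one legal burst; seeding the generator makes a failing sequence reproducible, much like a simulation seed.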