underscore

Projects

BitNet on GPT

An attempt to implement the BitNet paper on GPT, built on top of nanoGPT in PyTorch. It also contains an implementation of The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits.

I haven’t been able to train it at a large scale yet.
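To give a feel for what the 1.58-bit idea means in practice, here is a minimal sketch of absmean ternary weight quantization as I understand it from the paper: scale the weights by their mean absolute value, then round and clip each one to -1, 0, or +1. The function name, the `eps` value, and the plain-list representation are assumptions for illustration; this is not the code from the repository, which works on PyTorch tensors.

```python
def absmean_ternary_quantize(weights, eps=1e-5):
    """Quantize a 2-D list of floats to {-1, 0, +1} using absmean scaling.

    Hypothetical helper for illustration; not taken from the repository.
    """
    flat = [w for row in weights for w in row]
    # gamma is the mean absolute value over the whole weight matrix.
    gamma = sum(abs(w) for w in flat) / len(flat)
    scale = gamma + eps  # eps avoids division by zero for an all-zero matrix
    # Round to the nearest integer, then clip into the ternary range.
    # Note: Python's round() rounds exact halves to the nearest even integer.
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, gamma


weights = [[0.9, -0.05, -1.2], [0.3, 0.0, -0.6]]
quantized, gamma = absmean_ternary_quantize(weights)
print(quantized, gamma)
```

Small weights collapse to 0 and large ones saturate at -1 or +1, which is where the "1.58 bits" (log2 of three states) comes from.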

Mini ROS

This is a lightweight reimplementation of core ROS concepts, focusing on roscore and topic-based communication for subscribing and publishing. Built entirely in Go without external libraries, it mimics essential ROS behavior in a minimalistic way. The roscore server manages message exchanges between nodes, while topics enable asynchronous communication.

The goal is to make the underlying mechanisms easier to understand while keeping the flexibility and performance that Go’s concurrency model provides.

command     purpose
core        To start roscore server on master URL as in ROS
subscribe   To subscribe to a topic
publish     To publish a topic
status      To get stats of a topic
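The publish/subscribe flow behind these commands can be sketched roughly as below. This is a simplified in-process stand-in written in Python for brevity: the real project is written in Go and exchanges messages through the roscore server, and the class name `MiniCore`, its methods, and the shape of the stats dictionary are all hypothetical.

```python
import queue
import threading
from collections import defaultdict


class MiniCore:
    """Hypothetical in-process stand-in for the roscore server."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = defaultdict(list)  # topic -> list of inbox queues
        self._published = defaultdict(int)     # topic -> message count

    def subscribe(self, topic):
        """Register a new subscriber and return its inbox queue."""
        inbox = queue.Queue()
        with self._lock:
            self._subscribers[topic].append(inbox)
        return inbox

    def publish(self, topic, message):
        """Deliver a message to every current subscriber of the topic."""
        with self._lock:
            inboxes = list(self._subscribers[topic])
            self._published[topic] += 1
        # Queues are thread-safe, so delivery happens outside the lock.
        for inbox in inboxes:
            inbox.put(message)

    def status(self, topic):
        """Return simple stats for a topic."""
        with self._lock:
            return {
                "subscribers": len(self._subscribers[topic]),
                "published": self._published[topic],
            }


core = MiniCore()
inbox = core.subscribe("/cmd_vel")
core.publish("/cmd_vel", {"linear": 0.5, "angular": 0.0})
received = inbox.get(timeout=1)
print(received, core.status("/cmd_vel"))
```

Each subscriber gets its own queue, so a slow subscriber does not block the publisher; in Go the same role would naturally be played by channels and goroutines.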

TODO

  • ROS core
  • Publish topic
  • Subscribe to topic
  • Message types
  • Get topic metrics
  • Better CLI
  • Create more realistic topics such as /cmd_vel or /raw_image
  • ROS Node
  • ROS simple client library
  • ROS service
  • ROS launch file

NAT WSL

Hi, I recently ran into a problem when trying to access my server, which is running on my WSL Ubuntu distro, through my Wi-Fi router’s IP address. The thing is, WSL doesn’t automatically allow access through the router. Instead, it uses NAT (Network Address Translation) to route traffic through Windows, which is how I can access the internet from within WSL.

I’d like to set up a more comprehensive and straightforward NAT configuration. Ideally, I’d want to define the ports and settings for my WSL distro in a simple YAML file that automatically starts up when my system boots.

One way to simplify this process would be to run a script that calls netsh, which is the command-line tool Windows uses for managing network configurations. This script could handle all the NAT settings at runtime.
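As a rough sketch of that script-based approach, the snippet below turns a port-forwarding config into `netsh interface portproxy add v4tov4` commands. The config layout, the function name, and the WSL IP address are assumptions for illustration; the dictionary stands in for what a parsed YAML file might contain. The commands are only built and printed here, since actually running them needs an elevated prompt on Windows.

```python
def portproxy_commands(wsl_ip, forwards, listen_address="0.0.0.0"):
    """Build one netsh portproxy command (as an argument list) per rule.

    Hypothetical helper; the rule keys are an assumed config layout.
    """
    commands = []
    for rule in forwards:
        commands.append([
            "netsh", "interface", "portproxy", "add", "v4tov4",
            f"listenaddress={listen_address}",
            f"listenport={rule['listen_port']}",
            f"connectaddress={wsl_ip}",
            f"connectport={rule['connect_port']}",
        ])
    return commands


# Stand-in for a parsed YAML file; the IP and ports are made-up examples.
config = {
    "wsl_ip": "172.20.0.2",
    "forwards": [{"listen_port": 8080, "connect_port": 8080}],
}

commands = portproxy_commands(config["wsl_ip"], config["forwards"])
for command in commands:
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` at startup. The WSL address can change between restarts, so a real script would probably need to look it up each time instead of hard-coding it.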

However, I want to build a Network Address Translation system from scratch, because I find it challenging and want to understand how it works in depth. Doing so gives hands-on experience with network fundamentals and better control over the setup.

Reference

My LazyVim Config

A perfectly curated Neovim config. Built with Neovim, LazyVim and Mason.

AI on Web

Running AI models on the web.

Using onnxruntime-web to run BERT for sentiment analysis on the web.

This repository contains a simple ONNX (Open Neural Network Exchange) export of a classifier built on the microsoft/xtremedistil-l6-h256-uncased model. The model code is located in the onnx/model.py file, and the exported classifier is provided as both onnx/classifier.onnx and onnx/classifier_int8.onnx.

Model Information

  • Model Used: microsoft/xtremedistil-l6-h256-uncased
  • ONNX Model Location: onnx/model.py
  • Exported Classifier Models: onnx/classifier.onnx and onnx/classifier_int8.onnx
  • Colab Notebook: notebook

transformer

Just another implementation of the transformer model introduced in the paper Attention Is All You Need. It is a step-by-step walkthrough of building a transformer.

A tutorial project for understanding how the transformer works.

trBPE: A Byte Pair Encoder tailored for Turkish

The current landscape of Large Language Models (LLMs) predominantly caters to the English language. This bias can be attributed to extensive training on English datasets and to how well tokenization works for English. Notably, OpenAI’s tokenizer for GPT-4 excels at splitting English text into tokens along syllable-like divisions, which helps comprehension and generation.

However, for other languages such as Turkish, this advantage diminishes because words end up split into seemingly random pieces. To address this, a repository was created to develop a BPE tokenizer tailored to Turkish, using rich Turkish language datasets.

This was used by KomRade in the competition...

It attempts to replicate the methods outlined in this paper, with the following exceptions:

  • Non-agglutinative pieces are preceded by a space, and agglutinative pieces are not prefixed with #.
  • Tokenization is case-insensitive.
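
For readers new to BPE, here is a minimal sketch of the generic merge-learning loop, with lowercasing applied first to make it case-insensitive. It is plain textbook BPE under my own naming (`train_bpe` is hypothetical), not the trBPE code: it does not handle the space or agglutinative-piece conventions listed above, and Python's `str.lower()` is not Turkish-aware (it maps "I" to "i" instead of "ı").

```python
from collections import Counter


def train_bpe(corpus, num_merges):
    """Learn up to num_merges BPE merges from a whitespace-split corpus."""
    # Case-insensitive: lowercase first. str.lower() is not Turkish-aware.
    words = Counter(tuple(word) for word in corpus.lower().split())
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Pick the most frequent pair; ties go to the pair seen first.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        merged_words = Counter()
        for symbols, freq in words.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_words[tuple(out)] += freq
        words = merged_words
    return merges


merges = train_bpe("evler evlerde Evlerim", num_merges=4)
print(merges)
```

On this tiny corpus the shared stem "evler" is built up merge by merge, while the suffixes "de" and "im" stay as separate characters until later merges.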

Video Compress

A simple web app to convert videos from H.264 to H.265 encoding, significantly reducing file size while maintaining quality.

Why H.265?

H.265 (HEVC) is the successor to H.264 (AVC). It offers better compression, allowing for smaller file sizes or higher quality at the same bitrate. This project uses FFmpeg to convert videos from H.264 to H.265.
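
A minimal sketch of the kind of FFmpeg invocation such a conversion involves is shown below. The function name, the default CRF and preset values, and the file names are assumptions for illustration and are not taken from the app itself; the command is only built here, not executed.

```python
def h265_command(src, dst, crf=28, preset="medium"):
    """Build an FFmpeg argument list that re-encodes video to H.265.

    Hypothetical helper; the defaults are illustrative, not the app's settings.
    """
    return [
        "ffmpeg",
        "-i", src,            # input file
        "-c:v", "libx265",    # encode video with the x265 (HEVC) encoder
        "-crf", str(crf),     # quality target: lower means better quality, larger file
        "-preset", preset,    # encoding speed versus compression trade-off
        "-c:a", "copy",       # keep the audio stream as-is
        dst,
    ]


command = h265_command("input.mp4", "output.mp4")
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, check=True)` on a machine where FFmpeg is installed with libx265 support. Raising the CRF shrinks the file further at the cost of visible quality.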