- Manchester, UK
- in/ali-naeimi5055
Pinned
-
nanoplm
Public, forked from peymanvahidi/nanoplm
Dev fork of nanoplm, focused on throughput and architecture optimizations in pretraining.
Python 1
-
flash-mHC
Public
Fast implementation of manifold-constrained hyperconnections in Triton.
Python
-
nanogpt-fp8
Public
Nanochat-inspired LLM pretraining using Transformer Engine with MXFP8 and NVFP4 support. Up to 30% faster than nanochat.
-
matmul_assembly_x86
Public
Hyper-optimized FP32 GEMM kernels in handwritten AVX2 assembly, with a worklog of the optimizations implemented.
Assembly 4
-
llm.c
Public, forked from karpathy/llm.c
3x faster LLM training on CPU than Karpathy's original repo
Cuda 1
-
Candles-ProbCheck
Public
Simple notebook that checks the probability of each pair of candlesticks in a day being the same or opposite color over a given period of time.
Jupyter Notebook