speculative-decoding

Here are 21 public repositories matching this topic...

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

retrieval chatbot rag habana large-language-model chatpdf llm-inference 4-bits speculative-decoding llm-cpu streamingllm intel-optimized-llamacpp neural-chat neural-chat-7b autoround gaudi3

Updated Oct 8, 2024
Python

PygmalionAI / aphrodite-engine

Star

Large-scale LLM inference engine

machine-learning cuda intel api-rest lora rocm inference-engine tpu inferentia speculative-decoding

Updated Nov 4, 2024
Python

SafeAILab / EAGLE

Star

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

large-language-models llm-inference speculative-decoding

Updated Sep 27, 2024
Python

Infini-AI-Lab / Sequoia

Star

scalable and robust tree-based speculative decoding algorithm

efficiency inference llm speculative-decoding

Updated Aug 13, 2024
Python

Infini-AI-Lab / TriForce

Star

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

acceleration efficiency inference llm long-context llm-inference speculative-decoding

Updated Aug 31, 2024
Python

facebookresearch / LayerSkip

Star

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

optimization transformers early-exit llm speculative-decoding layer-drop

Updated Oct 27, 2024
Python

FasterDecoding / REST

Star

REST: Retrieval-Based Speculative Decoding, NAACL 2024

retrieval llm-inference speculative-decoding

Updated Sep 25, 2024
C

kssteven418 / BigLittleDecoder

Star

[NeurIPS'23] Speculative Decoding with Big Little Decoder

decoding efficient-inference speculative-execution fast-inference llm speculative-decoding

Updated Feb 6, 2024
Python

hemingkx / SpecDec

Star

Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

non-autoregressive speculative-decoding

Updated Dec 9, 2023
Python

romsto / Speculative-Decoding

Star

Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan et al. 2023.

fast-inference llm llm-inference speculative-decoding llm-optimization

Updated Oct 28, 2024
Python

hemingkx / SWIFT

Star

SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

speculative-decoding

Updated Oct 10, 2024
Python

mscheong01 / speculative_decoding.c

Star

minimal C implementation of speculative decoding based on llama2.c

c artificial-intelligence llm llama2 speculative-decoding

Updated Jul 15, 2024
C

AutonomicPerfectionist / PipeInfer

Sponsor

Star

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

inference llm llamacpp speculative-decoding

Updated Aug 16, 2024
C++

pinqian77 / Dynasurge

Star

Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding

large-language-models speculative-decoding

Updated Apr 29, 2024
Python

u-hyszk / japanese-speculative-decoding

Star

Verification of the effect of speculative decoding in Japanese.

nlp japanese fast-inference speculative-decoding

Updated Mar 4, 2024
Python

wtlow003 / speculative-sampling

Star

Implementation of Speculative Sampling in "Accelerating Large Language Model Decoding with Speculative Sampling"

deepmind llm-inference speculative-decoding speculative-sampling

Updated Aug 20, 2024
Python

haukzero / Speculative-Demo

Star

一个简单的投机推理实现

transformers pytorch speculative-decoding

Updated Sep 26, 2024
Python

PopoDev / BiLD

Star

Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder

reproducibility fast-inference llm speculative-decoding

Updated May 30, 2024
Python

majid-daliri / DISD

Star

Coupling without Communication and Drafter-Invariant Speculative Decoding

llm-inference speculative-decoding

Updated Aug 17, 2024
Python

kinshukdua / SpecDec

Star

Some experiments aimed at increasing LLM throughput and efficiency via Speculative Decoding.

inference llm speculative-decoding

Updated Jul 31, 2023
Python

Improve this page

Add a description, image, and links to the speculative-decoding topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the speculative-decoding topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

speculative-decoding

Here are 21 public repositories matching this topic...

intel / intel-extension-for-transformers

PygmalionAI / aphrodite-engine

SafeAILab / EAGLE

Infini-AI-Lab / Sequoia

Infini-AI-Lab / TriForce

facebookresearch / LayerSkip

FasterDecoding / REST

kssteven418 / BigLittleDecoder

hemingkx / SpecDec

romsto / Speculative-Decoding

hemingkx / SWIFT

mscheong01 / speculative_decoding.c

AutonomicPerfectionist / PipeInfer

pinqian77 / Dynasurge

u-hyszk / japanese-speculative-decoding

wtlow003 / speculative-sampling

haukzero / Speculative-Demo

PopoDev / BiLD

majid-daliri / DISD

kinshukdua / SpecDec

Improve this page

Add this topic to your repo