Hugging Face Model Distillation shrinks LLMs while retaining most of their accuracy by training a smaller student model to mimic a larger teacher, which makes the resulting models well suited to edge deployments.
https://huggingface.co/docs/optimum/model-distillation
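Below is a minimal sketch of the core distillation idea described above, using plain PyTorch rather than the Optimum API. The function name `distillation_loss` and the `temperature` and `alpha` parameters are illustrative assumptions, not names from the linked docs: the student is trained against a blend of the teacher's softened output distribution and the ground-truth labels.

```python
# Sketch of knowledge distillation (generic PyTorch, not the Optimum API).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Soft-target loss: KL divergence between temperature-scaled distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard-target loss: standard cross-entropy against ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# One training step: the teacher runs frozen, only the student is updated.
teacher = torch.nn.Linear(128, 10).eval()   # stand-in for a large pretrained model
student = torch.nn.Linear(128, 10)          # smaller model intended for deployment
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

inputs = torch.randn(8, 128)
labels = torch.randint(0, 10, (8,))

with torch.no_grad():
    teacher_logits = teacher(inputs)
student_logits = student(inputs)

loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
optimizer.step()
```

The temperature softens both distributions so the student can learn from the teacher's relative confidence across classes, while `alpha` balances imitation of the teacher against fitting the true labels.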