LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper
•
2404.16710
•
Published
•
80
<|YUE_START|> and <|YUE_END|>. The chat template isformatted_text = f"<|TEXT_UNDERSTANDING_START|><|YUE_START|>{input_text}<|YUE_END|><|TEXT_UNDERSTANDING_END|>"
chat = [
{"role": "user", "content": "Convert the text to speech:" + formatted_text},
{"role": "assistant", "content": "<|SPEECH_GENERATION_START|>" + ''.join(speech_ids_prefix)}
]