The attention mechanism in LLMs
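At the core of attention in large language models is the scaled dot-product operation, softmax(QK^T / sqrt(d_k)) V: each query scores every key, the scores are normalized into weights, and the output is the weight-averaged values. The following is a minimal NumPy sketch of that operation; the function name, shapes, and random toy inputs are illustrative, not from any particular library.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D inputs."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys: rows sum to 1
    return weights @ V                            # weighted sum of value vectors

# Toy example: 3 query tokens attending over 3 key/value tokens, head dim 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per query token
```

In a real transformer layer, Q, K, and V are linear projections of the token embeddings, the computation runs per attention head in parallel, and decoder models add a causal mask so each position attends only to earlier ones.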