loadValuesForAttention
fun loadValuesForAttention(layer: Int, startPos: Int = 0, endPos: Int = cache.currentSeqLen): FloatArray(source)
Load cached values for attention, dequantizing as needed.
Parameters
layer
Layer index
startPos
Start of the attention window (inclusive)
endPos
End of the attention window (exclusive)