Calculate the Jensen-Shannon divergence between the answer distribution generated from the full context and the distribution without specific sentences. The higher this value is for a sentence, the more the model "relied" on that sentence to generate its answer.
Instead of generating new answers, it measures and sums up the changes in token probability distributions when generating the same answer, which makes it computationally expensive (less expensive than Contextcite) requiring O(sentence) inferences.