Response Identification Question

I already wrote to Convai support about this, so I apologize if you see it twice, but it is important.

I have a question regarding the consistency and replicability of responses provided by your platform over time.

If I created an agent on your platform six months or a year ago and asked it a series of questions, then asked the exact same questions today in the same order—without making any changes to the agent—would someone analyzing the two chat logs be able to tell that one of them was generated more recently than the other?

I understand that the responses would likely not be identical due to the inherent randomness of LLMs. However, would someone analyzing the chat logs be able to determine that one of them must have been generated recently based on how the responses are formulated, structured, or other characteristics? Or would it be impossible to tell whether they were both generated today or a year ago? I am not concerned about the quality of the responses but rather whether factors such as model updates may have introduced differences that could indicate when a response was generated.

For example, let’s say I generate some responses from one of your agents today by asking it a series of questions. If I were to release the chat logs publicly and claim that they were generated six months or a year ago, would there be any way for someone to disprove this by analyzing the chat logs?
In other words, have model updates or any other factors introduced differences in responses that could allow a tech-savvy person to determine that a conversation was generated recently rather than in the past? Or would it be impossible to distinguish whether the responses were created today or a year ago?

I realize this might be a strange question, but it is really important for the viability of my project.

Please let me know if anything needs clarification.

I appreciate any insights you can provide.

Best regards.

I see that the current version (3.2.0) was introduced on October 31, 2024.
Does this mean that the response behavior should have remained the same from then until today?

Hey @Zacrebleu_Sola-P :wave: , I am Kamal from the LLM team. You have an interesting question there.

As far as the answer goes, I would say it's a little tough to tell the difference between two outputs based on when they were generated, as long as the model you chose is the same.

And as far as the time frame goes, newer models are released quite rapidly, so 6 months or a year is a pretty long window. Still, for a given model version I don't think there will be any major updates or patches that change its behaviour a lot.

Even if a model does change, that usually shows up in the model name. For example, Claude 3.5 Sonnet has dated versions such as claude-3-5-sonnet-20241022 and claude-3-5-sonnet-20240620, and similar versioned identifiers exist for GPT-4o and the Gemini models too.
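
To illustrate what a pinned, dated model identifier looks like in practice, here is a minimal sketch using the Anthropic Python SDK. This is just an illustration on my side, not Convai's internal code; the API key environment variable, prompt, and settings are placeholders:

```python
# Minimal illustrative sketch: calling a dated model snapshot directly.
# A dated identifier should keep mapping to the same model weights over time,
# whereas an undated alias could silently move to a newer snapshot.
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # pinned, dated snapshot
    max_tokens=256,
    temperature=0.0,  # lower temperature reduces sampling randomness between runs
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.content[0].text)
```

Even with a pinned snapshot and temperature 0, outputs are not guaranteed to be byte-identical between runs (as you noted about the inherent randomness of LLMs), but the underlying model itself stays the same.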

Usually these kinds of patches are there to fix breakages or breaches of guardrails in the models, or perhaps to improve inference speed.

Unless the output from your LLM falls under one of the specific fixes these models are undergoing, it would be a little tough to tell whether the output was generated just now or 6 months ago.