Prompt data is pointless and useless without a human to create a feedback loop for it, at which point it wouldn’t have context anyway. It would also take human effort to correct spelling and other user errors at the outset. Hugely pointless and unreliable.
Not to mention, what good would it do for training? It wouldn’t help the model at all.
You can collect the data and figure out how to use it later. Just look at the recent Google leaks and what they collect: it’s literally everything, down to the length of clicks and full walkthroughs of the site.
Collecting data about user interests is valuable in itself, and it’s plausible to analyze it with various metrics, even something as simple as sentiment analysis, which has been done broadly. Sentiment analysis predates modern ML by a long margin; you can read the wiki page on that.
But yeah, just think about stuff like Google Trends, which tracks interest in topics, as an example of what such data could be used for. And deanonymizing the inputs is probably possible to some degree, aside from the obvious trust we place in DDG as a centralized point of failure.
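To make that concrete, here is a rough Python sketch of the kind of analysis I mean: counting which terms show up across prompts (a crude Trends-style interest signal) and scoring each prompt against a word list (old-school lexicon sentiment). The word lists, function names, and sample prompts are all made up for illustration; this says nothing about what DDG or anyone else actually runs.

```python
import re
from collections import Counter

# Hypothetical toy lexicons, for illustration only; real ones are far larger.
POSITIVE = {"good", "great", "love", "best", "helpful"}
NEGATIVE = {"bad", "broken", "hate", "worst", "useless"}
STOPWORDS = {"the", "a", "is", "to", "my", "i", "for", "this", "so", "why"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Positive-word count minus negative-word count (lexicon approach)."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def topic_counts(prompts):
    """Count how many prompts mention each non-stopword term."""
    counts = Counter()
    for prompt in prompts:
        # set() so a term repeated inside one prompt is only counted once
        counts.update(t for t in set(tokenize(prompt)) if t not in STOPWORDS)
    return counts


# Made-up sample prompts
prompts = [
    "best laptop for programming",
    "why is my laptop battery so bad",
    "I love this laptop keyboard",
]

if __name__ == "__main__":
    print(topic_counts(prompts).most_common(3))
    for prompt in prompts:
        print(sentiment_score(prompt), prompt)
```

None of this needs a model or a human feedback loop; it's plain counting over stored text.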
You’re confusing analytics with direct input storage and reuse of prompt data to train somehow, as in your original comment.
Analytics has absolutely nothing to do with their model usage and training, and would be pointless for that purpose. Observing keywords and interests is standard analysis stuff. I don’t think anyone even cares about it anymore.