From: François Fleuret
Date: Tue, 25 Jun 2024 06:42:23 +0000 (+0200)
Subject: Update.
X-Git-Url: https://fleuret.org/cgi-bin/gitweb/gitweb.cgi?a=commitdiff_plain;h=779e3675414e061ad294c6b5599a7843d9e887bc;p=culture.git

Update.
---

diff --git a/report/culture.tex b/report/culture.tex
index 7bf330e..43aaefe 100644
--- a/report/culture.tex
+++ b/report/culture.tex
@@ -198,7 +198,13 @@ present in the original quizzes:
   quizzes in the representation landscape of GPTs. There probably is a
   subtle relation between the temperature (mutation rate) and the
   number of models used to validate with the ``all but one'' criterion
-  (survival).
+  (survival criterion).
+
+\item The ``all but one'' could be ``all but K'', and there may be
+  some information-theoretical thing, where the goal is to maximize
+  mutual information, with $K=N$ being total randomness, so high
+  entropy but no structure, and $K=0$ is total determinism, so no
+  information to share.
 
 \item The setup does not push toward any specific invariance or
   property in the generated quizzes, their consistency is entirely due
@@ -216,17 +222,21 @@ present in the original quizzes:
 
 \item This overall process probably fight the ``simplicity bias'': If
   a model is lacking a ``cue'' that the others have, there will
-  rapidly be quizzes that requires this cue, they will be added to the
+  rapidly be quizzes that require this cue, they will be added to the
   training data, and that model will catch up.
 
 \item The randomness of the process probably allow to even go beyond
   just synchronizing the abilities of the models. There may be some
   additional complexification of quizzes that get accepted by chance.
 
-\item The current process to generate new quizzes, which simply sample
-  them at random is very rudimentary and probably not sufficient in a
-  real-data setup. It can probably be supplemented with a MCTS-type
-  search.
+\item The current process to generate new quizzes, which simply
+  samples them at random is very rudimentary and probably not
+  sufficient in a real-data setup. It can probably be supplemented
+  with a MCTS-type search.
+
+\item There may be already in the generated quizzes some structure
+  that \textemph{we} do not pick up (e.g. certain color or motion
+  patterns).
 
 \end{itemize}
 