X-Git-Url: https://fleuret.org/cgi-bin/gitweb/gitweb.cgi?a=blobdiff_plain;f=report%2Fculture.tex;h=d9f39e30f7e0ff4a8a2a31d4020a6ca762e13bfc;hb=5f5c6c079c2751a76887444c211c5c464e875ed0;hp=7bf330ee439a1f3c33664a0d26024acc6c79670e;hpb=7d0b423ece4608825eed573eda70fd6b601cf80b;p=culture.git diff --git a/report/culture.tex b/report/culture.tex index 7bf330e..d9f39e3 100644 --- a/report/culture.tex +++ b/report/culture.tex @@ -5,7 +5,7 @@ %% Written by Francois Fleuret \documentclass[11pt,a4paper,oneside]{article} -\usepackage[paperheight=15cm,paperwidth=8cm,top=2mm,bottom=15mm,right=2mm,left=2mm]{geometry} +\usepackage[paperheight=15cm,paperwidth=8cm,top=2mm,bottom=15mm,right=5mm,left=5mm]{geometry} %\usepackage[a4paper,top=2.5cm,bottom=2cm,left=2.5cm,right=2.5cm]{geometry} \usepackage[utf8]{inputenc} \usepackage{amsmath,amssymb,dsfont} @@ -70,11 +70,44 @@ \def\concat{\operatorname{concat}} \def\crossentropy{\LL_{\operatorname{ce}}} +\newcommand{\separator}{\begin{center} +* +\end{center}} + +\newcommand{\pic}[2]{% +\hspace*{\stretch{1}} +% +\includegraphics[scale=0.25]{#1} +% +\hspace*{\stretch{1}}% +} + +\newcommand{\birdpic}[2]{% +\hspace*{\stretch{1}} +% +\includegraphics[scale=0.35]{#1} +% +\hspace*{\stretch{1}}% +} + +\newenvironment{example}{% + +\vspace*{2ex} + +\begin{minipage}{\textwidth} + +\setlength{\parindent}{0cm} +\setlength{\parskip}{1ex} +}{% +\end{minipage} +} + \begin{document} \vspace*{-3ex} \begin{center} + {\Large Self-Generated Culture} Fran\c cois Fleuret @@ -94,101 +127,379 @@ Fran\c cois Fleuret \section{Introduction} The hypothesis behind this experiment is that high-level abstract -thinking is fueled by social competition. A group of communicating -agents that try to demonstrate their cognitive superiority would end -up developing a rich and consistent culture. +thinking is fueled by social competition. 
+ +A group of communicating agents that try to demonstrate their +cognitive superiority would end up developing a rich and consistent +culture. + +\subsection{Setup} The experiment is designed with a group of GPTs that alternatively learn to solve quizzes and generate new ones. -A ``quiz'' is a triplet of the form $(A, d, B)$ where $A$ and $B$ are -two sequences and $d$ is a token indicating if the direction is -forward or backward. Given $(A, d)$, the challenge is to generate $B$. +A ``quiz'' is a pair composed of a prompt and a solution, both being +sequences of tokens. + +We differentiate \textbf{world quizzes} that follow pre-defined and +fixed regularities, and mimic the world's physical and environmental +patterns that an organism has to grasp to survive, and \textbf{culture + quizzes} that are generated by the GPTs, and mimic the knowledge one +has to master to perform socially. + + +We train five GPTs on a very large set of ``world quizzes'' +generated randomly. These models are trained to generate both the +solution given the prompt, and the prompt given the solution. + +This is achieved by training on both ``forward sequences'', +composed of a token \texttt{[fwd]}, followed by the prompt's tokens, +followed by another token \texttt{[fwd]}, followed by the solution's +tokens, and ``backward sequences'', composed of a token \texttt{[bck]}, +followed by the solution's tokens, followed by another token +\texttt{[bck]}, followed by the prompt's tokens. + +\subsection{Generating Culture Quizzes} + +When their accuracy gets above $95\%$ we generate new quizzes as follows: +% +\begin{enumerate} + +\item generate a solution (without conditioning) at temperature $T=2$, + then generate a prompt for that solution at temperature $T=1/2$, and + then generate a solution for that prompt at temperature $T=1/2$.
+ +\item generate one solution for that prompt with each of the $5$ GPTs + at temperature $T=1$, if $4$ of them generate the correct solution, + validate that quiz and include it in the training data. + +\end{enumerate} + +This criterion ensures that the new quizzes are both solvable and +sophisticated, and incrementally complexify the culture. Imposing both +directions prevents the generation of quizzes which are not trivial +only because the prompt has been randomly degraded. + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\pagebreak + +\section{Grid Quizzes} + +\subsection{World Quizzes} + +We define several types of quizzes and implement algorithmic +procedures to randomly generate as many examples of each as we +need. + +In these quizzes, the prompt is made of three grids $A, f(A), B$ and +the solution is a single grid $f(B)$. + +\subsubsection{Half Fill} + +\pic{pics/task_color_grow.png}{``half fill''} + +The first grid contains three rectangles, each with a vertical or a +horizontal line of another color in its middle. The second grid is +identical except that one of the rectangles has one half filled. The third +grid contains three rectangles of the same colors as the first grid, +of different sizes and locations. The solution is obtained by similarly +filling one half of a rectangle of the third image. + +\subsubsection{Detect} + +\pic{pics/task_detect.png}{``detect''} + +The first grid contains three rectangles, the second has two pixels of +the same colors, located in the top-left corners of two of them. The +solution is obtained by marking in the fourth image the top-left +corners of the rectangles of the same colors in the third.
+ +\subsubsection{Frame} + +\pic{pics/task_frame.png}{``frame''} + +The first grid contains three rectangles, and the second is identical +except that one rectangle has been replaced by its frame. The same +should be done to the similarly colored rectangles of the third grid +to obtain the solution. + +\subsubsection{Grow} + +\pic{pics/task_grow.png}{``grow''} + +The first grid contains three rectangles, one of them getting one +pixel thicker or thinner in the second. The same should be done to the +similarly colored rectangles of the third grid to get the solution. + +\subsubsection{Replace color} + +\pic{pics/task_replace_color.png}{``replace color''} + +The first grid contains three rectangles, the second is obtained by +changing one of the colors. The same should be done to the third grid +to obtain the solution. + +\subsubsection{Translate} + +\pic{pics/task_translate.png}{``translate''} + +The first grid contains three rectangles. The second is obtained by +displacing one of them by one pixel in both directions. The solution is +obtained by applying the same motion to the similarly colored +rectangle in the third grid. + +%% \subsubsection{Bounce} + +%% \pic{pics/task_bounce.png}{``bounce''} + +%% The solution should join the two pixels of same color, with a path of +%% another color, starting in the direction indicated by a pixel of that +%% color, and changing direction only when colliding with a pixel of a +%% third color or one of the lattice border. + +%% \subsubsection{count} + +%% \pic{pics/task_count.png}{``count''} + +%% \subsubsection{scale} + +%% \pic{pics/task_scale.png}{``scale''} + +%% \subsubsection{trajectory} + +%% \pic{pics/task_trajectory.png}{``trajectory''} + +\subsection{Culture Quizzes} + +We list here some generated quizzes that exhibit features that were not present in the ``world quizzes'' used for training.
+ +\bigskip + +\begin{example} + +\pic{pics/culture_c_quiz_0110_N4_validated/quiz_63.png}{0110/63} + +\pic{pics/culture_c_quiz_0115_N4_validated/quiz_37.png}{0115/37} + +The quizzes ``frame'' and ``half fill'' have been combined in a single +quiz. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0120_N4_validated/quiz_05.png}{0110/05} + +The ``frame'' quiz has been generalized to non-rectangular shapes. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_01.png}{0078/01} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_02.png}{0078/02} + +More rectangles were added as distractors. + +\end{example} -The experiments starts with a set of quizzes, that is going to be -progressively enriched. +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0087_N4_validated/quiz_62.png}{0087/62} + +\pic{pics/culture_c_quiz_0102_N4_validated/quiz_04.png}{0102/04} + +\pic{pics/culture_c_quiz_0102_N4_validated/quiz_11.png}{0102/11} + +\pic{pics/culture_c_quiz_0108_N4_validated/quiz_31.png}{0108/31} + +Variation of ``Detect'' with location markers colored according to the +color of the rectangle they mark. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_16.png}{0078/16} + +\pic{pics/culture_c_quiz_0084_N4_validated/quiz_21.png}{0084/21} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_42.png}{0078/42} + +\pic{pics/culture_c_quiz_0089_N4_validated/quiz_28.png}{0089/28} + +\pic{pics/culture_c_quiz_0084_N4_validated/quiz_00.png}{0084/00} + +Variations of ``Half Fill'', ``Detect'', ``Translate'', ``Grow'', and +``Frame'' with a number of rectangles not equal to three. 
+ +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_27.png}{0078/27} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_18.png}{0078/18} + +\pic{pics/culture_c_quiz_0086_N4_validated/quiz_45.png}{0086/45} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_37.png}{0078/37} + +Variations of ``Half Fill'' where the shapes to change have more +complex coloring. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_30.png}{0078/30} + +Variation of ``Translate'' where the moving part is occluded, which +was never the case in the world quizzes. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_31.png}{0078/31} + +\pic{pics/culture_c_quiz_0084_N4_validated/quiz_10.png}{0084/10} + +\pic{pics/culture_c_quiz_0084_N4_validated/quiz_12.png}{0084/12} + +\pic{pics/culture_c_quiz_0086_N4_validated/quiz_23.png}{0086/23} + +\pic{pics/culture_c_quiz_0086_N4_validated/quiz_28.png}{0086/28} + +Variations of ``Half Fill'' with non-rectangular shapes. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0078_N4_validated/quiz_60.png}{0078/60} + +\pic{pics/culture_c_quiz_0084_N4_validated/quiz_41.png}{0084/41} + +\pic{pics/culture_c_quiz_0084_N4_validated/quiz_49.png}{0084/49} + +\pic{pics/culture_c_quiz_0086_N4_validated/quiz_04.png}{0086/04} + +Variations of ``Half Fill'' where two colors or two rectangles have to +be modified. + +\end{example} + +\separator + +\begin{example} + +\pic{pics/culture_c_quiz_0111_N4_validated/quiz_23.png}{0111/23} + +Variation of ``Frame'' with no rectangle of adequate size to be +modified.
+ +\end{example} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\pagebreak \section{Bird World} +These results were obtained with a slightly different procedure. In +particular the quizzes were validated if the models could predict both +the solution from the prompt and the prompt from the solution. We +report them since they exhibit the same patterns of generalization +although they are quite different. + +\subsection{World Quizzes} + The initial set of quizzes consist of predicting the dynamics of a very simple world: A $6 \times 8$ grid with three colored ``birds'' moving in a straight line, possibly bouncing on the grid's borders. There are ten different colors. % -\begin{center} -\includegraphics[scale=0.35]{pics/examples_train.png} -\end{center} +\birdpic{pics/examples_train.png}{} % -\vspace*{-2ex} - In each on these quizzes, $A$ is the left image serialized in raster-scan order as a sequence of $6 \times 8 = 48$ tokens, $d$ is either the token ``forward'' or the token ``backward'', and $B$ is the right image, also serialized. The direction of prediction is chosen at random. -\section{Generating Quizzes} +\subsection{Culture quizzes} -Given a set of $N$ GPTs, we can generate new quizzes as follows: -Select one of the models, and use it to generate the $97$ tokens of a -triplet $(A, d, B)$. +This procedure results in the discovery of patterns which are not +present in the original quizzes: -Then with each one of the $N-1$ other models, predict $B$ from $(A, -d)$, and $A$ from $(B, d')$ where $d'$ is the direction token opposite -of $d$. 
+\begin{example} -A quiz is validated if \textbf{all the other GPTs but one predict it - deterministically correctly in both directions.} +\birdpic{pics/4_birds_1.png}{} -This criterion assures that the new quizzes are both solvable and -sophisticated, and incrementally complexify the culture. Imposing both -direction prevents the generation of quizzes which are not trivial -only because the prompt has been randomly degraded. +\birdpic{pics/5_birds_1.png}{} -\section{Overall Process} +\birdpic{pics/6_birds_1.png}{} -The overall process consists of training the GPTs from scratch by -iterating the following steps: -% -\begin{itemize} +More birds. -\item select the GPT with the lowest recorded test accuracy, train it through one epoch, +\end{example} -\item if its test accuracy gets above $97.5\%$, generate $1'000$ new - quizzes, add them to the training set, re-compute the accuracy of - all the models +\separator -\end{itemize} +\begin{example} -\section{Results} +\birdpic{pics/other_shapes_2.png}{} -This procedure results in the discovery of patterns which are not -present in the original quizzes: +\birdpic{pics/other_shapes_3.png}{} -\textbf{More birds} +New bird shapes. -\begin{center} -\includegraphics[scale=0.35]{pics/4_birds_1.png} -\includegraphics[scale=0.35]{pics/5_birds_1.png} +\end{example} -\includegraphics[scale=0.35]{pics/6_birds_1.png} -\end{center} +\separator -\textbf{New bird shapes} +\begin{example} -\begin{center} +\birdpic{pics/other_shapes_1.png}{} -\includegraphics[scale=0.35]{pics/other_shapes_2.png} -\includegraphics[scale=0.35]{pics/other_shapes_3.png} -\end{center} +\birdpic{pics/occlusions_1.png}{} -\textbf{Occlusions} +Occlusions. 
-\begin{center} -\includegraphics[scale=0.35]{pics/other_shapes_1.png} -\includegraphics[scale=0.35]{pics/occlusions_1.png} -\end{center} +\end{example} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\pagebreak \section{Various thoughts} @@ -198,7 +509,13 @@ present in the original quizzes: quizzes in the representation landscape of GPTs. There probably is a subtle relation between the temperature (mutation rate) and the number of models used to validate with the ``all but one'' criterion - (survival). + (survival criterion). + +\item The ``all but one'' could be ``all but K'', and there may be + some information-theoretical thing, where the goal is to maximize + mutual information, with $K=N$ being total randomness, so high + entropy but no structure, and $K=0$ is total determinism, so no + information to share. \item The setup does not push toward any specific invariance or property in the generated quizzes, their consistency is entirely due @@ -216,17 +533,25 @@ present in the original quizzes: \item This overall process probably fight the ``simplicity bias'': If a model is lacking a ``cue'' that the others have, there will - rapidly be quizzes that requires this cue, they will be added to the + rapidly be quizzes that require this cue, they will be added to the training data, and that model will catch up. \item The randomness of the process probably allow to even go beyond just synchronizing the abilities of the models. There may be some additional complexification of quizzes that get accepted by chance. -\item The current process to generate new quizzes, which simply sample - them at random is very rudimentary and probably not sufficient in a - real-data setup. It can probably be supplemented with a MCTS-type - search. 
+\item It can be parallelized by dispatching the GPTs across multiple + nodes, and avoiding a quadratic cost by limiting the validation of + the quizzes to a subset of them. + +\item The current process to generate new quizzes, which simply + samples them at random is very rudimentary and probably not + sufficient in a real-data setup. It can probably be supplemented + with an MCTS-type search. + +\item There may already be some structure in the generated quizzes + that \emph{we} do not pick up (e.g. certain color or motion + patterns). \end{itemize}
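The validation step of ``Generating Culture Quizzes'', generalized to the ``all but K'' criterion discussed in the list above, might be sketched as follows. This is only an illustration and not the code of the experiment: the names \texttt{Solver}, \texttt{count\_correct}, \texttt{is\_validated} and \texttt{nb\_failures} are hypothetical, each model is stood in for by a plain function from prompt tokens to solution tokens, and the sketch assumes that \emph{exactly} $K$ models must fail, which the text leaves open.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a GPT: maps prompt tokens to a proposed solution.
Solver = Callable[[Sequence[int]], List[int]]


def count_correct(solvers: Sequence[Solver], prompt: Sequence[int],
                  solution: Sequence[int]) -> int:
    """Number of solvers whose output matches the reference solution exactly."""
    return sum(1 for solve in solvers if list(solve(prompt)) == list(solution))


def is_validated(solvers: Sequence[Solver], prompt: Sequence[int],
                 solution: Sequence[int], nb_failures: int = 1) -> bool:
    """'All but K' criterion with K = nb_failures.

    Assumption: exactly nb_failures solvers must get the quiz wrong, so that
    quizzes solved by every model (too easy) are rejected as well.
    """
    return count_correct(solvers, prompt, solution) == len(solvers) - nb_failures


# Toy usage with five stub solvers, four of which return the reference solution.
good = lambda prompt: [1, 2, 3]
bad = lambda prompt: [0, 0, 0]
solvers = [good, good, good, good, bad]
accepted = is_validated(solvers, prompt=[9], solution=[1, 2, 3])
```

With five solvers and \texttt{nb\_failures=1} this corresponds to the ``$4$ of them generate the correct solution'' rule; setting \texttt{nb\_failures=0} would accept only quizzes that every model solves.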