description: A minimal implementation of a GPT-like model.
last change: Sat, 21 Jan 2023 15:32:53 +0000 (16:32 +0100)
shortlog
2023-01-21 François Fleuret Added default configurations and reformated with black. [master]
2022-12-03 François Fleuret Update.
2022-12-03 François Fleuret Update.
2022-08-27 Francois Fleuret The "mask" array actually specifies what attention...
2022-08-20 Francois Fleuret Replaced --synthesis_sampling with --deterministic_synt...
2022-08-08 Francois Fleuret Added args.learning_rate_end for an exponential decay.
2022-08-08 Francois Fleuret Added the small weight embedding + id layer norm inits.
2022-08-07 Francois Fleuret Added the rng state in the checkpoint.
2022-08-07 Francois Fleuret Added the small-weight embedding initialization.
2022-07-30 Francois Fleuret Update.
2022-07-30 Francois Fleuret Update.
2022-07-30 Francois Fleuret Update.
2022-07-30 Francois Fleuret Fixed a bug when there are no squares.
2022-07-29 Francois Fleuret OCDC
2022-07-29 Francois Fleuret Update.
2022-07-29 Francois Fleuret Update.
...
heads
15 months ago master