mirror of https://github.com/osmarks/nanogpt-experiments.git synced 2024-11-10 20:09:58 +00:00
nanogpt-experiments/data
リョウゼ be571fff2c
Improve readability of huge numbers
Before:
  length of dataset in characters:  1115394
  all the unique characters: 
   !$&',-.3:;?ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
  vocab size: 65
  train has 1003854 tokens
  val has 111540 tokens

After:
  length of dataset in characters: 1,115,394
  all the unique characters: 
   !$&',-.3:;?ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
  vocab size: 65
  train has 1,003,854 tokens
  val has 111,540 tokens
2023-01-16 22:05:32 +01:00
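
The change described in the commit message above can be sketched with Python's built-in `,` format specifier, which inserts a comma between each group of three digits. This is a minimal illustration, not the commit's actual diff: the helper name `format_count` is hypothetical, and the numbers are copied from the Before/After output above.

```python
def format_count(n: int) -> str:
    """Return n with commas as thousands separators (hypothetical helper)."""
    # The ',' option in the format spec groups digits in threes.
    return f"{n:,}"


# Counts taken from the commit message's sample output.
print(f"length of dataset in characters: {format_count(1115394)}")
print(f"vocab size: {format_count(65)}")
print(f"train has {format_count(1003854)} tokens")
print(f"val has {format_count(111540)} tokens")
```

Numbers below 1,000, such as the vocab size, pass through unchanged, so the same formatting can be applied to every count without special cases.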
openwebtext       candidate changes to apis, have to think through more  2023-01-01 01:29:48 +00:00
shakespeare       candidate changes to apis, have to think through more  2023-01-01 01:29:48 +00:00
shakespeare_char  Improve readability of huge numbers                    2023-01-16 22:05:32 +01:00