• NaibofTabr@infosec.pub
    9 months ago

    Do you need the dataset to do the compression? Is the trained model not effective on its own?

    • Tibert@compuverse.uk
      9 months ago

      Well, from the article a dataset is required, but not always the heavier one.

      Though it doesn’t solve the speed issue: the LLM takes a lot more time to do the compression.

      gzip can compress 1 GB of text in less than a minute on a CPU, while an LLM with 3.2 million parameters requires an hour to compress the same amount.
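      For a rough feel of the gzip side of that comparison, a minimal sketch using Python's standard library (the sample text and sizes here are made up for illustration, not the article's benchmark):

      ```python
      import gzip
      import time

      # ~9 MB of repetitive stand-in text (a real corpus would compress less well)
      data = b"The quick brown fox jumps over the lazy dog. " * 200_000

      start = time.perf_counter()
      compressed = gzip.compress(data, compresslevel=6)
      elapsed = time.perf_counter() - start

      ratio = len(compressed) / len(data)
      print(f"{len(data)} bytes -> {len(compressed)} bytes "
            f"(ratio {ratio:.3f}) in {elapsed:.2f}s")
      ```

      Even at the default level, this finishes in well under a second on a modern CPU, which is the gap the comment is pointing at: an LLM has to run a forward pass per token just to produce the probabilities its arithmetic coder needs.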