A while back we had the "rotation trick" to improve VQ bottlenecks (https://x.com/sedielem/status/1863672703489634335), now we have DiVeQ, which seems to improve codebook coverage quite significantly. ... the space-filling version seems a bit like cheating though https://x.com/sedielem/status/2044805717958533395/photo/1 https://twitter.com/arnosolin/status/2044150151636238523
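For readers unfamiliar with the rotation trick the tweet refers to: instead of copying the decoder's gradient straight through the quantization step, it rewrites the quantized output as a (detached) rotation and rescaling of the encoder output, so gradients get rotated into the codebook vector's frame. Below is a minimal PyTorch sketch of my reading of that idea, assuming flattened `(B, D)` encoder outputs `e` and their nearest codebook vectors `q`; the function name, `eps` handling, and tensor layout are illustrative, not the paper's reference code.

```python
import torch

def rotation_trick(e: torch.Tensor, q: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # e: encoder outputs (B, D); q: their nearest codebook vectors (B, D).
    e_norm = e.norm(dim=-1, keepdim=True)
    q_norm = q.norm(dim=-1, keepdim=True)
    e_hat = (e / (e_norm + eps)).detach()
    q_hat = (q / (q_norm + eps)).detach()
    # Householder-style construction; the near-antipodal case e ~ -q is
    # degenerate and left unhandled in this sketch.
    r = e_hat + q_hat
    r = (r / (r.norm(dim=-1, keepdim=True) + eps)).detach()
    scale = (q_norm / (e_norm + eps)).detach()
    # R = I - 2 r r^T + 2 q_hat e_hat^T rotates e_hat onto q_hat. R and the
    # norm ratio are treated as constants (detached), so the forward value
    # equals q while the backward pass rotates gradients instead of copying
    # them through unchanged.
    rot_e = (e
             - 2.0 * (e * r).sum(dim=-1, keepdim=True) * r
             + 2.0 * (e * e_hat).sum(dim=-1, keepdim=True) * q_hat)
    return scale * rot_e
```

On the forward pass `scale * rot_e` reduces to `q` (up to `eps`), so only the gradient path differs from the usual straight-through copy.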

Seven t-SNE plots appear in a horizontal row labeled STE, EMA, RT, ST-GS, NSVQ, DiVeQ, and SF-DiVeQ, with numerical values 0.012, 0.024, 0.018, N/A, 2.6·10^{-4}, 5·10^{-5}, and 3.9·10^{-5} above each, showing red point clouds or crosses for the learned codebook C_z alongside gray points for the latents P_z in most plots. The caption below reads "Figure 4: Codebook misalignment: t-SNE plots of the learned codebook C_z (red crosses) and latent P_z (gray points) representations for different VQ methods in VQ-VAE compression", with additional text referencing Sec. 4, Fig. 26, and distortion per bit.
Context: Quoting @arnosolin: "1/ New paper: Differentiable Vector Quantization (DiVeQ). Vector quantization (VQ) is a key building block in modern AI. It links continuous data like images and audio to discrete representations (tokens) used by transformers. https://x.com/arnosolin/status/2044150151636238523/video/1" The "rotation trick" link points to an earlier tweet, @sedielem: "Better VQ-VAEs with this one weird rotation trick! I love papers like this: a simple change to an already powerful technique, that significantly improves results without introducing complexity or hyperparameters. https://t.co/E0ykEXEbwq (h/t lucidrains) https://t.co/iHTz6PpKfK"
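For reference, the STE baseline that anchors Figure 4 is standard VQ-VAE quantization: nearest-neighbor lookup in a codebook, with the straight-through estimator copying the decoder's gradient past the non-differentiable argmin. A minimal sketch, assuming a learnable `(K, D)` codebook matrix and illustrative names:

```python
import torch

def ste_quantize(e: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    # e: encoder outputs (B, D); codebook: (K, D) learnable embedding matrix.
    dists = torch.cdist(e, codebook)   # (B, K) pairwise Euclidean distances
    idx = dists.argmin(dim=-1)         # index of the nearest code per latent
    q = codebook[idx]                  # (B, D) quantized latents
    # Straight-through estimator: the forward pass outputs q, the backward
    # pass copies the decoder's gradient onto e unchanged, skipping argmin.
    return e + (q - e).detach()
```

Because the gradient is copied rather than adapted per code, rarely selected codes get little learning signal, which is the codebook-coverage problem the methods in Figure 4 try to address.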
| Time (UTC) | Views | Likes | Bookmarks | RTs | Replies |
|---|---|---|---|---|---|
| 11:00 AM | +27 | 0 | 0 | 0 | 0 |
| 10:50 AM | +48 | 0 | 0 | 0 | 0 |
| 10:40 AM | +29 | 0 | +1 | 0 | 0 |
| 10:30 AM | +34 | 0 | +1 | 0 | 0 |
| 10:20 AM | +30 | +1 | 0 | +1 | 0 |
| 10:10 AM | +19 | 0 | 0 | 0 | 0 |
| 10:00 AM | +32 | 0 | 0 | 0 | 0 |
| 9:50 AM | +9 | 0 | +1 | 0 | 0 |
| 9:40 AM | +50 | 0 | 0 | 0 | 0 |
| 9:30 AM | +25 | 0 | 0 | 0 | 0 |