February 2021’s Gwern.net newsletter is now out; the previous issue was January 2021 (archives). This is a summary of the revision-history RSS feed, overlapping with my Changelog & /r/gwern; brought to you by my donors on Patreon.
1 Writings
Gwern.net: popups: can now be moved, stickied, and full-screened (another step towards our ambition of Windows-95-in-the-browser!)
2 Links
2.1 AI
“Controllable Neural Text Generation”, Lilian Weng; “Recent Advances in Language Model Fine-tuning”, Sebastian Ruder (review)
“Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm”, Reynolds & McDonell 2021 (original 10-shot Fr → En translation can be beaten by the better 0-shot prompt: “French: XYZ / English:…”; this is “true of most worst-performing prompts…”); “Calibrate Before Use: Improving Few-Shot Performance of Language Models”, Zhao et al 2021 (huge boost from calibrating unstable prompts; both demonstrate, as always, that “sampling can prove the presence of knowledge but not the absence.”)
“TransGAN: Two Transformers Can Make One Strong GAN”, Jiang et al 2021 (Transformer-only GAN: attention is all you need)
“PACT: Proof Artifact Co-training for Theorem Proving with Language Models”, Han et al 2021 (GPT-f for Lean)
“Towards End-to-End In-Image Neural Machine Translation”, Mansimov et al 2020 (sure why not)
Brains:
“Artificial Neural Nets Finally Yield Clues to How Brains Learn” (short overview of biologically-plausible backprop: feedback alignment, target propagation, predictive coding, & attentional feedback; also of recent interest, VS-ML; given their increasing success in training while respecting more biological constraints, the increasing power of backprop-trained ANNs and the neurological success of ANNs in predicting & imitating brain signals, it is increasingly clear that brains really do do backprop in some sense)
“NSD: A massive 7-tesla fMRI dataset to bridge cognitive and computational neuroscience”, Allen et al 2021 (“…The availability of NSD thus opens the door to using brain activity to directly guide the optimization of deep neural networks.”)
“Brain2Pix: Fully convolutional naturalistic video reconstruction from brain activity”, Le et al 2021 (reconstructing Dr. Who)
“High-performance brain-to-text communication via imagined handwriting”, Willett et al 2020
“Brain-computer interface for generating personally attractive images”, Spape et al 2021 (many ways to improve this…)
“Scaling Laws for Transfer”, Hernandez et al 2021 (“We find that pre-training effectively multiplies the fine-tuning dataset size”; a shot across the bow of anyone floating on a proprietary-dataset moat: large models can drop data requirements by orders of magnitude overnight, even surpassing you)
“ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision”, Jia et al 2021 (see also CC-12M; CLIP-like w/EfficientNet trained on 1.8 billion images on a TPUv3-1024—DM argues that fancier cross-modal Transformers are better, nevertheless, ‘TPUs go brrr’. Given DALL·E, CLIP, ALIGN, VDVAE, CW-VAE, AIPO et al, are GANs already dead, and just don’t realize it yet? Or at least soon to be relegated to only DRL-like uses as a final finetuning phase to sharpen up a self-supervised model?); “WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training”, Huo et al 2021
“DALL·E: Zero-Shot Text-to-Image Generation”, Ramesh et al 2021 (original blog); “M6: A Chinese Multimodal Pretrainer”, Lin et al 2021 (Chinese DALL·E: 1.9TB images/0.29TB text for 10b-parameter dense/100b-parameter MoE Transformer; shockingly fast Chinese replication of DALL·E/CLIP)
“Explaining Neural Scaling Laws”, Bahri et al 2021/“Learning Curve Theory”, Hutter 2021 (Rohin Shah commentary; more on the manifold hypothesis)
2.2 Genetics
Everything Is Heritable:
“Phenotypic covariance across the entire spectrum of relatedness for 86 billion pairs of individuals”, Kemper et al 2021
“Genetic variation, brain, and intelligence differences”, Deary et al 2021
“Pathfinder: A gamified measure to integrate general cognitive ability into the biological, medical and behavioural sciences”, Malanchini et al 2021 (not the focus, but the IQ PGS is a slight improvement over Allegrini et al 2018 due to less phenotype measurement error?)
“Polygenic burden has broader impact on health, cognition, and socioeconomic outcomes than most rare and high-risk copy number variants”, Saarentaus et al 2021
Recent Evolution:
“Kin selection explains the evolution of cooperation in the gut microbiota”, Simonet & McNally 2021
Engineering:
2.3 Statistics/Meta-Science
“Lessons from Gerolamo Cardano’s The Book of My Life” (progress studies; see also Newton’s anthropic argument, Bakewell & inventing progress, The Autobiography of Benvenuto Cellini)
“How Many Microcovids Would You Spend on a Burrito?” (on the microCOVID Project Calculator)
“On the enfeeblement of mathematical skills by ‘Modern Mathematics’ and by similar soft intellectual trash in schools and universities”, Hammersley 1968 (Knuth highlights as also amusing: “A Note on Piffles”, Smith 1967; “A rebuke of A. B. Smith’s paper, ‘A Note on Piffles’”, Farlow 1980)
“Artifact and Recording Concepts in EEG”, Tatum et al 2011 (on the EEG signals of Jell-O, or, the importance of negative controls)
2.4 Politics/Religion
“The Logic of Fashion Cycles”, Acerbi et al 2012; “Fashion and art cycles are driven by counter-dominance signals of elite competition: quantitative evidence from music styles”, Klimek et al 2019; “The hipster effect: When anti-conformists all look the same”, Touboul 2019; “Right Is The New Left”, Scott Alexander (see also Han et al 2010, Downs 1972/Gupta & Jenkins-Smith 2015, Lorenz-Spreen et al 2019/Candia et al 2019, Loury 1994)
“What can we learn from the lunar pandemic that never was?” (NASA’s lunar quarantine was a sham intended to mollify the public as they covered up repeated major failures & lab leaks both before & after—had there been any dangerous lunar organisms, they would have escaped easily)
MrBeast (the new aristocracy of prestige? Borrowed plumage, perhaps, but effective…)
“Russia’s new Lysenkoism”, Kolchinsky et al 2017
2.5 Psychology/Biology
Semaglutide: “Once-Weekly Semaglutide in Adults with Overweight or Obesity”, Wilding et al 2021; “Effect of Subcutaneous Semaglutide vs Placebo as an Adjunct to Intensive Behavioral Therapy on Body Weight in Adults With Overweight or Obesity: The STEP 3 Randomized Clinical Trial”, Wadden et al 2021
A longer-acting version of the insulin/appetite peptide liraglutide, semaglutide greatly reduces weight, fat, blood sugar, cholesterol etc, with an upcoming oral version; background: Kushner et al 2020, Aroda et al 2019, Nauck & Meier 2019, O’Neil et al 2018, Blundell et al 2017, Nauck et al 2016, Lau et al 2015.
“Lessons from the host defences of bats, a unique viral reservoir”, Irving et al 2021 (bat-borne viruses; previously, Trevor Klee)
“Beneficial & Detrimental Effects of Reactive Oxygen Species on Lifespan: A Comprehensive Review of Comparative & Experimental Studies”, Shields et al 2021 (antioxidants still aren’t the fountain of youth, and may be harmful; animal studies still frequently inconsistent)
“Positive expectations predict improved mental-health outcomes linked to psychedelic microdosing”, Kaertner et al 2021 (placebo)
“The Effects of Fluoride in Drinking Water”, Aggeborn & Öhman 2021
“Sleep & Sex: What Can Go Wrong? A Review of the Literature on Sleep Related Disorders and Abnormal Sexual Behaviors & Experiences”, Schenck et al 2007
2.6 Technology
Wringing gauge blocks (“With their precisely-flat metal faces, gauge blocks can be stuck together non-magnetically via a process called ‘wringing’, requiring substantial effort to separate. Scientists are still uncertain exactly how wringing works.”)
2.7 Economics
“Why did renewables become so cheap so fast? And what can we do to use this global opportunity for green growth?”, Max Roser (specifically, why such an extreme experience curve?)
“IQ, trading behavior, and performance”, Grinblatt et al 2012; “Genetic Endowments and Wealth Inequality”, Barth et al 2020 (why, despite notorious setbacks, did Isaac Newton & LTCM’s founders die wealthy? Why, in general, are more intelligent people so much better investors? ‘The indifference of the indicator’: it’s not one thing, it’s everything—more intelligent people have lower discount rates, save more for longer & are less risk-averse, more accurately predict future growth or inflation, are more likely to participate in +EV opportunities like the stock market, to use low-fee rather than high-fee (and thus, underperforming) mutual funds, succumb less to biases like herding as they trade better & at better times, trade less, and harvest losses more efficiently when trading poorly.)
2.8 Philosophy
Are ethics experts more ethical? “The Behavior of Ethicists”, Schwitzgebel & Rust 2016 (most recently: “The moral behavior of ethics professors: A replication-extension in German-speaking countries”, Schönegger & Wagner 2019; given moral licensing & activism, perhaps we should be surprised we don’t hear about more ethicists doing things like posting enemy lists or trying to dox reviewers. “Woe to you Pharisees!”)
“Meta-analysis on belief in free will manipulations”, Genschow et al 2021 (another noble lie turns out to be ignoble)