Friday, November 17, 2017

Neural Nets or PDEs on Manifolds.

I've had a very long journey trying to wrap my head around this information without resorting to religious gibberish. It's also evolving so fast that I'm citing work from just a month ago. What I mean is, I'm trying to get all the pieces down here even if I can't fully articulate them yet.

Src: https://www.quantamagazine.org/a-new-spin-on-the-quantum-brain-20161102/
First I will point out a paper. There's this amazing work from September called "Deep Residual Networks and PDEs on Manifolds" that attempts to apply the transport equation, the Hamilton-Jacobi equation, and the Fokker-Planck equation to a Deep Residual Network - a convolutional neural net with shortcut connections added. What this could mean is that you could apply bare physics as constraints on information/energy flow in these now information-fluidic neural networks and optimize the fuck out of them that way. Quantum Information Theory has hardly even been tapped yet.
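Just to make the paper's core trick concrete for myself: a residual block computes x + f(x), which is literally a forward-Euler step of the ODE dx/dt = f(x, t), so a deep stack of blocks is integrating a flow - and that's the foothold for bringing in the transport/Fokker-Planck machinery. Here's a minimal numpy sketch of that reading (the toy layer f, the depth, and the step size are all just illustrative, not anything from the paper):

```python
import numpy as np

def f(x, W, b):
    """A toy residual branch: one dense layer with a tanh nonlinearity."""
    return np.tanh(W @ x + b)

def resnet_forward(x, layers, dt=1.0):
    """Each block does x <- x + dt * f(x): a forward-Euler step of dx/dt = f(x, t).
    Stacking many blocks integrates that flow, which is the PDE reading of a ResNet."""
    for W, b in layers:
        x = x + dt * f(x, W, b)
    return x

rng = np.random.default_rng(0)
dim, depth = 8, 20
layers = [(0.1 * rng.standard_normal((dim, dim)), np.zeros(dim)) for _ in range(depth)]
x0 = rng.standard_normal(dim)
print(resnet_forward(x0, layers))  # the input features, transported along the learned flow
```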

Convolutional Neural Networks (CNNs) use stacked layers of artificial neurons ("perceptrons") to pick out visual features, mainly for categorizing images. They're loosely based on how biological visual processing works. Dan Ciresan's work on CNNs can do things like identify cancerous cells 10 years in advance of metastasis, or help a drone learn to navigate forests on its own. ResNets are a way of improving a CNN's output (say, an image classifier) by adding extra computational layers, and they deliver state-of-the-art speed and capability, as does the new CapsNet. CapsNet is based on new findings in neuroscience about neural simplicial complexes, though the method has been around for years - it's only recently been refined by real-world findings. CapsNet has delivered state-of-the-art results in practice, which should reinforce the possibility that something like it is how our brains do it - a living organism trying to spend the least amount of energy possible in order to maintain homeostasis and survive.
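For concreteness, here's roughly what a residual block looks like in code - a minimal PyTorch sketch of my own, not pulled from any of the papers above. The "shortcut" is just adding the block's input back onto its output:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two conv layers whose output is added back onto the input (the 'shortcut')."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # residual: the layers only learn a correction to x

block = ResidualBlock(channels=16)
print(block(torch.randn(1, 16, 28, 28)).shape)  # torch.Size([1, 16, 28, 28])
```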

ResNet

CapsNet

The main differences between ResNets and CapsNets, as far as I can tell: CapsNet adds "routing-by-agreement" between its capsule layers - the coupling between a lower capsule and a higher one gets adjusted iteratively, on the fly, depending on how much their outputs agree - on top of the usual back-propagation training, and its extra structure is nested within the main layers. ResNet instead uses extra connections as "shortcuts" between neural layers in case there are deeper "residuals" present, or resonant features that might not be apparent in top-down analysis (see the old-school "Adaptive Resonance Theory" for elaboration). You can think of this like a glorified routing problem for data applied to different layers of statistical analysis. I'm omitting several other points worth discussing, but let's just piece this together first.
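To pin down the "routing" part, here's a stripped-down numpy sketch of CapsNet-style routing-by-agreement as I understand it from the paper - the shapes, iteration count, and random inputs are arbitrary, and this ignores the surrounding conv layers and training entirely. Lower capsules "vote" for higher capsules, and the couplings get re-weighted by how much each vote agrees with the consensus:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """CapsNet nonlinearity: shrinks short vectors toward 0, long ones toward unit length."""
    norm2 = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, iterations=3):
    """u_hat[i, j, :] is lower capsule i's predicted output for higher capsule j."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                 # routing logits
    for _ in range(iterations):
        c = softmax(b, axis=1)                  # coupling coefficients per lower capsule
        s = np.einsum('ij,ijd->jd', c, u_hat)   # weighted vote for each higher capsule
        v = squash(s)                           # higher-capsule outputs
        b += np.einsum('ijd,jd->ij', u_hat, v)  # reward votes that agree with the consensus
    return v

rng = np.random.default_rng(0)
v = route(rng.standard_normal((32, 10, 16)))    # 32 lower capsules voting for 10 higher ones
print(v.shape)                                  # (10, 16)
```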

"At resonance, this thing makes a lot of noise." - 2deep4me




Just putting this here for thinking material

These networks are "incorrect" with respect to Artificial Intelligence. They solve problems, but they are still too rigid. To put it simply, nature seems to have a solution that merges optimal energy flow with natural structural dynamics, a process that seems to underpin evolution itself. What, then, is the least rigid method? Well, that's where manifold analysis comes in. Manifold classification is broadly what you are doing when you are analyzing maps of images and objects. A manifold is a topological space that locally looks like ordinary flat space, which makes it a general-purpose setting for analysis. We map the earth to a sphere because the earth is roughly spherical, for instance, and then we can triangulate distances with very high accuracy. Basically you're finding coordinate charts - real-valued functions that reduce patches of the space to ordinary intervals of coordinates. Higher-dimensional manifolds essentially let you analyze broader scopes of phenomena, though not in the way we're used to, since you work with pure topology and differential equations.
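A tiny concrete example of the "map the space to coordinates, then compute" move: latitude/longitude form a chart on the sphere, and great-circle distance drops out of a short formula in those coordinates. The city coordinates and Earth radius below are just round numbers I'm plugging in:

```python
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine formula: distance on a sphere, computed entirely in chart coordinates."""
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))

# Roughly New York -> London, using approximate coordinates.
print(round(great_circle_km(40.7, -74.0, 51.5, -0.1)), "km")  # ~5570 km
```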

Possible simplexes across brain regions. Src: Network neuroscience, Nature.com

Possible simplexes within a rat hippocampus. Src: Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function, Blue Brain Project

When you are dealing with unknowns - say you don't know the relationship between two points on your manifold - you can adopt partial differential equations. So, say you create a simulation of melting fluids and you don't know how every point is gonna change: you estimate each point using the relationship between the main function, the variables it depends on, and its derivatives (i.e. the rates of change of the function with respect to those variables). On manifolds, PDEs get expressed through "jet bundles," a special type of fiber bundle that packages a function together with its derivatives. Working in these bundles through calculation (since you can't rely on one global coordinate system) provides local solutions on the manifold (i.e. solving for whatever conditions you're given). This lets you control for as many variant conditions as you can imagine and gives you an abstract playing field of geometries and vectors for finding solutions.
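In code, the "estimate each point from the function and its derivatives" move looks like this: an explicit finite-difference step for the 1-D heat equation, du/dt = alpha * d2u/dx2, as a toy stand-in for the melting-fluids example. Grid size, alpha, and the time step are arbitrary, just chosen to keep the scheme stable:

```python
import numpy as np

def heat_step(u, alpha, dx, dt):
    """One explicit Euler step of du/dt = alpha * d2u/dx2, with the ends held fixed."""
    d2u = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx ** 2
    u_new = u + dt * alpha * d2u
    u_new[0], u_new[-1] = u[0], u[-1]          # fixed boundary temperatures
    return u_new

n, alpha, dx = 50, 1.0, 1.0
dt = 0.4 * dx ** 2 / alpha                     # stability needs dt <= 0.5 * dx^2 / alpha
u = np.zeros(n)
u[n // 2] = 100.0                              # a hot spot in the middle

for _ in range(500):
    u = heat_step(u, alpha, dx, dt)

print(u.max(), u.sum())                        # the spike spreads out and flattens
```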

These ideas have produced highly competitive neural network solutions in the form of "Manifold Tangent Classifiers," which seem to be relatively obscure yet blow away the typical neural networks. The one I linked can learn the MNIST dataset (a current standard in visual neural network benchmarks) faster than CNNs or some of the other major competitors from 2014. I can't find more recent work, but these MTCs notably use a "maximum entropy"-based noise reduction function - one of the most essential concepts (imho) for understanding quantum interactions and probabilities in natural systems. Here's a demonstration of why: Thermodynamics, Evolution, and Behavior by Rod Swenson. Entropy and thermodynamics in general provide a unique scale-free window for observing the behavior of objects, with scale-free (apparently like our universe) being the key term here.
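Since maximum entropy keeps coming up, here's what it cashes out to numerically: among all distributions over some set of states with a fixed average "energy," the one with maximum entropy is the Gibbs/Boltzmann form, and you can find it by tuning a single Lagrange multiplier. The energy levels and target mean below are made-up numbers for illustration:

```python
import numpy as np

def maxent_dist(energies, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Bisection on the Lagrange multiplier beta so that the Gibbs distribution
    p_i ~ exp(-beta * E_i) matches the target mean energy. That distribution is
    the maximum-entropy one subject to this single constraint."""
    energies = np.asarray(energies, dtype=float)

    def mean_energy(beta):
        logw = -beta * energies
        w = np.exp(logw - logw.max())          # shift for numerical stability
        p = w / w.sum()
        return p @ energies, p

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        m, p = mean_energy(mid)
        if m > target_mean:                    # mean energy decreases as beta increases
            lo = mid
        else:
            hi = mid
    return p

E = [0.0, 1.0, 2.0, 3.0, 4.0]
p = maxent_dist(E, target_mean=1.2)
print(np.round(p, 3), "entropy:", round(-(p * np.log(p)).sum(), 3))
```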

From: "Scale-Free Networks: A Decade and Beyond"


Clearly there's a balance to be found. From "Modern network science of neurological disorders"
Coming back around, a favorite of physicists, the Hamilton-Jacobi equation, is a PDE-on-a-manifold reformulation of classical mechanics (i.e. Newton's laws) that lets you solve for conserved quantities in mechanical systems even when you can't completely solve the system (it's also where the old analogy between light rays and particle trajectories lives). This becomes crucial when approaching quantum mechanics, where you are dealing with collections of different or similar many-state systems (i.e. a complex system) that often play off each other instantaneously. That's hard to compute at our macro scales. My favorite of the PDEs, the Fokker-Planck equation, lets you find how probability distributions evolve for multi-level phenomena like these, and it has incredible research backing it, implicating everything from cellular biology (the author of this paper is a cellular biologist who studied Fokker-Planck systems in cells) to the formation of the cosmos as predictable through these "control equations." That's really what you're doing: you're controlling the ordering of information in the total system to reveal something about the system itself or parts of it, which is especially useful in learning, say, painting - with all its variants built on simple physical principles about ink and brush strokes.
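To get a numerical feel for what Fokker-Planck "controls": simulate a cloud of particles following the matching stochastic equation (overdamped Langevin dynamics), and their histogram settles into the stationary solution of the Fokker-Planck equation, proportional to exp(-U(x)/D). The double-well potential, noise level, and step sizes here are arbitrary demo choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def U_prime(x):
    """Gradient of a double-well potential U(x) = (x^2 - 1)^2."""
    return 4 * x * (x ** 2 - 1)

D, dt, steps, n = 0.5, 1e-3, 20000, 2000
x = rng.standard_normal(n)                        # a cloud of particles

# Euler-Maruyama: each particle drifts downhill and gets kicked by thermal noise.
for _ in range(steps):
    x += -U_prime(x) * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n)

# The empirical histogram approximates the Fokker-Planck stationary density ~ exp(-U/D).
hist, edges = np.histogram(x, bins=40, range=(-2.5, 2.5), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
theory = np.exp(-((centers ** 2 - 1) ** 2) / D)
theory /= theory.sum() * (edges[1] - edges[0])
print(np.round(hist[:10], 3))
print(np.round(theory[:10], 3))
```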

Convergence of expected solutions on a statistical manifold. From: "Combinatorial Optimization with Information Geometry: The Newton Method"

What I'm getting at - what the first paper I mentioned was too professional to say - is that the best learning methods will be the most universal, which is a highly non-trivial statement when you respect the history of physics and mathematics, even back to alchemy through Egypt and China. What that means in practice at this point (and I'm sure the academics are way ahead of me on this) is that the walls of these highly effective ResNets and CapsNets will come down with manifold classification - i.e. what nature does constantly and unconsciously (or "unsupervised") via the separation of natural forces - that successfully rigs, say, the Fokker-Planck equation as a radical optimizer: a way to learn many, many more models from a similar amount of data. The more granular you get with judging your analytical framework, which seems to have this real bottom-up probability physics to it when done right, the more optimization you'll get, and potentially the theoretical limit of learning capacity for computer neural networks - potentially AI. That will also mean bridging QIT with the best neural network designs via exploits (or "truths") like the maximum entropy principle, by my best guess. That's how D-Wave does their quantum computing. Quantum discord is another relatively new concept that is flipping a lot of understandings of quantum physics on their head, something I also predict will be extremely useful in QIT-integrated learning models.

Exploring corner transfer matrices and corner tensors for the classical simulation of quantum lattice systems.
There's an amazing how-to if I ever saw one. Here's another one.
A New Spin on the Quantum Brain
Quantum computation in brain microtubules? The Penrose-Hameroff `Orch OR' model of consciousness

 
Getting closerrrr.



You heard it here first, friends. If this didn't come off as total gibberish that's a fucking miracle.

Edits: Trying not to link sketchy sources, this is all sketchy either way.



Wednesday, November 15, 2017

Thought Vomit 3: Blankets, structural dynamics, and you.


OK so here's a big ol' dark matter simulation of the universe. 
Compare its structure to the images below.



So on the left here you have the gray matter composition of an infant motor cortex compared to an adult motor cortex on the right. Many of the connections get myelinated or pruned away depending on which priorities are taken.





Alrighty, and here's how a slime mold mapped the railways of Tokyo (the core of the mold was placed at the main station, food was placed where the other stations are, and light at different intensities was used to simulate terrain, since the slime mold avoids it). The network it grew was more energy-efficient by distance than the design the engineering team made. Obviously the human-made one was also built for time considerations and not just energy considerations - that's what makes the difference.

Now what am I vomiting forward here, succinctly?

Firstly, our executive functions emerge out of a cellular network that, in its base form, is making pure energy-efficiency/survival considerations. In tandem with the rest of the cells, they are able to learn to respond to greater and greater amounts of stimuli in continuation of that process. Before that was possible, nature itself, by means of gravity, had ordered the universe in a way that created dozens of stable elements from single protons, neutrons, and electrons (the first two being combos of quarks), which then formed planets and biospheres.

Okay, enter Maxwell's demon. The idea is basically that a tiny observer with enough information about each molecule could move a medium, say a gas, from equilibrium (maximum entropy) to a state of high potential (low entropy, i.e. putting all the hot gas molecules on one side of a chamber instead of them being evenly mixed throughout as they would tend to be naturally, give or take some turbulence). The catch, per Landauer's principle, is that acquiring and erasing that information itself costs energy, so information and energy end up on the same ledger. You can be as efficient as possible by solving for the minimum amount of energy required to do the sorting, but that is generally very difficult and time-consuming to compute as humans without massive parallel computation power to account for all the possibilities - something pretty new for us.
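To put a number on the "minimum amount of energy" part: Landauer's bound says processing/erasing one bit at temperature T costs at least k_B * T * ln(2), and sorting N molecules into one half of a chamber is a one-bit decision per molecule. A few lines of arithmetic (the molecule count is just an example figure):

```python
from math import log

k_B = 1.380649e-23      # Boltzmann constant, J/K
T = 300.0               # room temperature, K

landauer_per_bit = k_B * T * log(2)
print(f"Minimum cost per bit at 300 K: {landauer_per_bit:.2e} J")   # ~2.9e-21 J

# Confining N gas molecules to one half of a chamber is a 1-bit decision per molecule,
# so the minimum work (equivalently, the free energy gained) is N * k_B * T * ln 2.
N = 1e20                # example number of molecules
print(f"Minimum work to sort {N:.0e} molecules: {N * landauer_per_bit:.3f} J")
```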

Here's an "information-heat engine" experiment that more or less proved the Maxwell's demon idea could help us create nanobots: https://arxiv.org/abs/1009.5287. These researchers created a little stair-stepper nanobot made of very few atoms and sent commands for it to walk up a nano-scale staircase using a minimum amount of heat as the information medium. A bi-product of being able to solve for the minimum amount of energy to operate a medium (i.e. finding the fastest, most efficient way to promote survival without getting everyone killed - i.e. the "best" method) is that you can suddenly use that metaphor to figure out how to send real information through the tiniest mediums and command even nanoscopic structures. Quantum mechanics just got an upgrade.

So the gist is, with that big ol superstructure up there, and these allegorical structures down here... what if we're the nanobots?

 *hides under blanket*

Never forget your blanket.

 

This came out the day after I posted this, makes a nice addition: