Monte Carlo Petals

I finished finals today, and I’m pretty glad; it’s been a long, grueling spring semester and summer should prove one hell of a reward this time around.

After submitting my last assignment of the 2016-2017 school year, a finicky problem set for a thermodynamics course, I walked to my bus stop past a tree boasting a great deal of flowering buds—not uncommon for the season. And as I strolled by, its petals floated down and were carried by the wind, which made for a pleasant scene.

So I moved in to get a closer look and to take a picture of it. Here it is.

It doesn’t do it justice. I know people always say this, but you really had to be there.

I also ended up noticing that they landed in a very particular pattern on the ground. Basically, as you got closer to the tree, there were more petals.

And that makes sense: wind can only carry the petals so far from where they start on their branches. But so many petals had been falling for so long that, as a whole, they formed a remarkably clear map of their landing distribution.

Incredibly, you could easily make out something that resembled a gradient towards the center, kind of like the field around a point charge in 2D: strong in the middle, and weaker as you get further out.

But it wasn’t this pattern the entire way through: nearing the center of the tree, the petal density only got higher up to a point, after which it started decreasing again. This also makes sense because, well, tree trunks don’t grow petals.

But it was undeniable that the pattern had a nice, unique beauty to it, and so I wanted to see if I could remake it myself. I also figured that simulating the random falling of petals to get a sense of the end distribution was essentially a Monte Carlo method, which uses repeated random sampling to estimate distributions and other quantities that are hard to compute directly.

It ended up looking pretty awesome, and it was actually a relatively simple and short method. Here’s what I did:

Part Zero: The Game Plan


“Monte Carlo Petals” should consist of three main steps. First, we have to set the petal starting positions. Then, we need an algorithm to generate a random wind vector for each petal. Finally, we add each wind vector to the starting position and plot the results.

Let’s get this big one out of the way first: how do we generate random vector directions? If you think about it, that’s basically the backbone of this whole project. We need some way to make a set of equal-length (preferably unit-length) vectors that are distributed randomly with respect to direction. After that, we can scale the magnitudes using any probability distribution we like.

I found this online for Python, which seems to do the trick:

import numpy as np

def random_vec(dims, number):
    # Draw Gaussian samples and normalize them. Since the multivariate
    # normal distribution is rotationally symmetric, the resulting unit
    # vectors are uniformly distributed over all directions.
    vecs = np.random.normal(size=(number, dims))
    mags = np.linalg.norm(vecs, axis=-1)
    return vecs / mags[..., np.newaxis]

It seems fairly robust, too. Plotting:

import matplotlib.pyplot as plt
from matplotlib import collections

def main():
    # Draw 1000 unit vectors as line segments from the origin.
    ends = random_vec(2, 1000)
    vectors = np.insert(ends[:, np.newaxis], 0, 0, axis=1)
    figure, axis = plt.subplots()
    axis.add_collection(collections.LineCollection(vectors))
    axis.axis((-1, 1, -1, 1))
    plt.show()

main()

# And the endpoints alone, as a scatter plot.
x = random_vec(2, 200)
plt.scatter(x[:, 0], x[:, 1])
plt.show()

We get this graph:

So, we’ll just use that code. Why not? A little help can speed things up tremendously.

Part One: Petal Ring


My model of petal starting positions was pretty basic; I just created a randomly distributed ring of them, modeling how the trunk and inner branches wouldn’t carry petals.

Take a random direction vector of length one and multiply it by a random value between a smaller number and a larger number. Do this enough times, and you’ll map out a ring: the inner radius is the smaller number, and the outer radius is the larger one.

Using our random_vec() function:

Niter = 2000

# 2000 random directions, each scaled by a uniform radius between 12 and 15.
startvecs = random_vec(2, Niter)
mags = np.random.uniform(low=12, high=15, size=Niter)
startvecs = startvecs * mags[:, np.newaxis]
plt.scatter(startvecs[:, 0], startvecs[:, 1])
plt.show()

Basically, we made 2000 “petals” in a ring with inner radius 12 and outer radius 15. It’s worth noting that the outermost part of the ring has a lower density than the innermost: each petal is equally likely to land at any radius, but there’s more space to fill as you move outward.

Part Two: Some Wind Vectors


How strong is wind, on average? It’s got to be more than zero, if there’s any wind at all. The wind blowing near the tree when I first saw it was pretty strong, but not overwhelming.

How each petal is directionally affected by wind should be pretty random as well (even if the wind itself is quite biased, it’ll change over time). We’ll start with the same random_vec() function usage as last time to make 2000 random directions. Then, we’ll multiply each direction by a length determined by a normal distribution around some set average.

This should make for a better wind model than, say, a uniform distribution, like we used for the petal ring. A normal distribution lets us make most of the wind around a set average strength, with less likely deviations from that average.

Here’s what it looks like written down:

avgwind = 5.0
winddev = 3.0

# 2000 random directions, each scaled by a normally distributed wind strength.
windvecs = random_vec(2, Niter)
windmags = np.random.normal(loc=avgwind, scale=winddev, size=Niter)
windvecs = windvecs * windmags[:, np.newaxis]
plt.scatter(windvecs[:, 0], windvecs[:, 1])
plt.show()

And the resulting graph:

You’ll notice that our average wind is low enough that there’s still a lot of mass around the center, especially since a negative wind strength just reverses the vector. This is intentional. We want a good portion of the petals to stay close to where they started, so that the final plot still reflects the starting positions. Still, we want a lot of petals to move some too, which is why we set our average above 0.

Part Three: Putting It All Together


Now we just need to add the wind vectors to our petal ring. Since we started off using arrays, we can do this in one fell swoop, like so:

petalvecs = windvecs + startvecs
plt.scatter(petalvecs[:,0],petalvecs[:,1])
plt.show()

This produces the following graph:

Finally, I messed around with the numbers (petal number, average wind strength, wind deviation, and inner/outer ring radii), made the graph a little bigger and more square, changed the marker color/size/type, and added a background for some finishing touches.
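If you want a starting point for that styling pass, here’s one possibility, continuing from the variables above (the exact parameter values are my guesses, not the originals):

figure, axis = plt.subplots(figsize=(8, 8))   # bigger, square figure
axis.set_facecolor("black")                   # dark background for contrast
axis.scatter(petalvecs[:, 0], petalvecs[:, 1],
             c="magenta", s=4, marker=".")    # small magenta petals
axis.set_aspect("equal")
plt.show()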

As promised, these are our Monte Carlo Petals:

Nice. They’re not quite as visually appealing as the real thing, but I think they have their own simple beauty.

There’s still a lot more we can play around with, too.

Here’s the graph imposed on top of the (darker) starting positions:

And here’s a graph of each wind vector from its starting position to the final petal location (I reduced the petal amount, since it’s not really interesting in the form of a jumbled mass of white and magenta):

What else? We could directly plot the density as the darkness of a color, and figure out a function for the expected density value based on the input parameters.
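For the density idea, here’s a minimal sketch using matplotlib’s built-in 2D histogram, continuing from petalvecs above (the bin count and colormap are my own choices):

# Bin the final petal positions and shade each bin by its count.
counts, xedges, yedges, image = plt.hist2d(
    petalvecs[:, 0], petalvecs[:, 1], bins=60, cmap="magma")
plt.colorbar(label="petals per bin")
plt.show()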

And we could also mess around more with the variables themselves—wind deviation, average wind, petal amount, and the inner/outer ring radii—to make other pretty graphs and figure out the patterns that emerge from changing those numbers. Needless to say, there’s a lot I left unexplored.

What do you think? Have at it, and tell me how it goes.

Colorless Green Ideas Sleep Furiously

*Chomsky’s linguistics ideas and philosophy deserve textbook-sized analyses in their own right (which they have often received). If this is your first time hearing of him, do yourself a favor and dig deeper.

If you’ve heard of Noam Chomsky*, the revolutionary modern linguist and philosopher, then you might have also heard of this absurd expression he came up with: “Colorless green ideas sleep furiously.” It was designed to be an example of a sentence that’s grammatically correct yet semantically nonsensical—in other words, despite being totally valid English, it has no discernible meaning.

Think about how bizarre that sentence is for a second, and how well it works to accomplish its intended goal. “Colorless green ideas sleep furiously.” Everything in it contradicts. The two adjectives, colorless and green, are oxymoronic, and, in their literal sense, aren’t words that can describe an idea. The adverb and verb also form their own oxymoron (how would you sleep furiously?), and they describe an action that an idea cannot perform. In short, there are four layers of contradiction here—impressive for a five-word sentence—yet it still works in English.

But for a lot of people, it feels like it shouldn’t work. It’s an example of a category mistake. The words in the sentence establish categories for the other words that they don’t end up satisfying.

Should category mistakes be considered grammar mistakes? Does the existence of nonsense sentences like this one mean we need to reform fringe aspects of English grammar? I don’t really know, and there are many people far more qualified than I am working on that question. But here’s an interesting, fun challenge: can you find discernible meaning in that sentence?

Finding meaning in a sentence like “Colorless green ideas sleep furiously” might seem totally pointless, but there has been documented interest in doing exactly that. It doesn’t really defeat the original purpose of the sentence, and it poses a fun challenge.

You could find meaning by interpreting some of the words metaphorically, or with alternate definitions. For example, “colorless” can describe things that are boring or nondescript, and “green” may be reinterpreted as jealous or environmental. You could then read it as “Boring jealous ideas have violent nightmares,” which almost makes sense.

Another, more interesting way to do this is to try to provide meaning for the sentence through context. There was actually a competition held at Stanford to provide that context in 100 words or less. Most of the entries were highly metaphorical. The winner, for example, was this poem:

Thus Adam’s Eden-plot in far-off time:

Colour-rampant flowers, trees a myriad green;

Helped by God-bless’d wind and temperate clime.

The path to primate knowledge unforeseen,

He sleeps in peace at eve with Eve.

One apple later, he looks curiously

At the gardens of dichromates, in whom

colourless green ideas sleep furiously

then rage for birth each morning, until doom

Brings rainbows they at last perceive.

-A. H. Byatt

This, mind you, is absolutely gorgeous poetry. But personally, that’s still a little too abstract for me. So the other day, I tried to write a context for the sentence under the constraint of using as little metaphor as possible. Here is the result:

Stacy Kyle was a climate scientist with some very bold and ambitious ideas. Energy consumption and production must be made emissions-free, she explained, while proposing the idea of slowly converting the entire country’s supply to renewables. The idea was not well-received, partly due to feasibility concerns and political opposition, and partly because of its unmarketable and unoriginal image. It was quickly put to rest, until a string of famous social critics reignited interest in the matter, citing Kyle’s ideas and hailing her particular rhetorical method as genius. A plurality of the population took interest, and began to ardently support the growing movement. Now under immense, fiery pressure, the opposing faction conceded the point that sometimes even colourless green ideas sleep furiously.

I slightly exceeded 100 words, but I thought it worked well to accomplish the goal: it described an environmental (green) idea; “unmarketable and unoriginal image” provided context for “colorless,” “put to rest” was the justification for “sleep,” and “immense, fiery” and “ardently” supported “furiously.” I basically intended the last part as “Sometimes even uninspired environmental initiatives do not go quietly into the night,” which I think is pretty literal.

It’s not perfect, though—not by a long shot. It still uses some abstraction, and the sentence didn’t stand on its own; it was part of a bigger one. Can we do better?

If by “better,” you mean more literal, then yes. It’s more or less a function of however specific/contrived we allow the context to be.

With a little more leeway, let’s try this once again:

“Astounding,” said Dr. Glover, marvelling at the sight ahead. “What do you call it?”

“How about… the Prismatic Brainalyzer.”

His doctoral student, Orville Wheeler, was a prodigy in the cognitive neuroscience of dreams. And he had proven as much—he created the incredible invention now sitting in front of them.

“That’s very fitting,” agreed the doctor, “So, explain to me once again how it works.”

Orville was excited to show off his chops. “The… Prismatic Brainalyzer… analyzes the patterns of a sleeping patient’s brain. It first implants an idea into their head to develop. Then it assigns each of them a “color” value, which corresponds to the emotional weight and type of their dream, as well as the clout it occupies in their mental space.”

“Interesting.”

“Yes. It also gives an intensity reading, which you can see on this chart on the machine.” Orville pointed at the chart and then hesitated. “…We don’t have a technical term yet, but I like to label high intensity, from around 700-900 lumens, as ‘colorful.’”

“And the low intensity ones?”

“Colorless.”

Dr. Glover paused while thinking, and lightly paced the laboratory. “So what do different colors tell you about how implanted ideas are developing in the dreams?”

“See, that’s the thing. We can only make indirect observations about what they mean. It’s very clear that different colors correspond to different mental states, but finding out the exact meaning of each color and intensity takes time. We first have to find patients to implant ideas in, get them to sleep, and then question them about their dreams. And still, they don’t always remember…”

“I see. Well, what do we know so far?”

“Patients have described colorful yellow ideas as blissful.”

“And?”

“Colorful red ideas tend to sleep fastidiously.”

“Anything else?”

“Not much. Colorless green ideas sleep furiously.”

“Astounding,” said the doctor, now more intrigued than ever, “Continue research with patients. I’ll work on getting us more funding. This could be huge.”

Orville smiled. “Yes, it could.”

A little outlandish, but it’s fun—par for the course with sci-fi.

Writing story contexts like these reminds me of those jokes where you read through a lengthy, detailed story, and then it ends up all being a setup for some awful one-liner (i.e. a “Shaggy Dog Story”). Except instead of a pop-culture reference or bad pun, we have our weird Chomsky sentence to end it. And I think writing stories for this has its own added challenge, since the sentence was designed to be nonsensical.

Seriously, imagine how impressive it would be to read one where you had no clue what it was doing all the way up until seeing “Colorless green ideas sleep furiously.” Go make some crazy, contrived contexts yourself, and be sure to enjoy writing cool stories along the way, too.

As always: Have at it, and tell me how it goes.

The Madelung Constant Finder

Note: This is a follow-up to my post on the Madelung constant (it’s for NaCl). If you don’t know what that is yet, take a look at it here.

I wrote the Madelung constant finder code in Python, so you’ll need some interpreter if you want to run it alongside this explanation.

If you don’t have it already, I recommend downloading the Anaconda distribution for Python. It comes preloaded with a bunch of useful packages (NumPy, SciPy, Matplotlib, etc.) and Jupyter, an IDE good for demonstrating what your code can do in an organized and runnable environment. I’m uploading the Jupyter notebook of this code as well as the raw .py scripts; I highly recommend the Jupyter option if you want to follow along closely.

Coordinate Generator

The first thing I did was establish the best logical framework for computing the constant. In my opinion, the best option is to generate a list of lists—with each nested list representing a “triplet” coordinate set—and then calculate the influence of each. (They’re not like real triplets, since the things inside aren’t identical; I just call them that in my syntax, so I’ll use the name here too for consistency.) You might be quick to point out the potential for shorter, more elegant, vectorized solutions here, but the triplet list method is surprisingly fast (especially if you only generate a few and multiply, as I demonstrate) and it carries the benefit of being really easy to understand just from looking at the code.

So, how do you generate a list of all possible 3D coordinate sets? You don’t. Well, you can, but it’s slower. You’ll see what I’m talking about soon.

My friend whipped up a script for generating coordinate sets, which I refined to stop at certain shells (you just end the loop once the final triplet of your desired shell has been generated). It looks like this:

shell_number = 3
triplets = [[0, 0, 1]]  # list of coordinate sets, starting from [0, 0, 1]

# Each set is sorted low-to-high, so triplets[-1][0] only reaches
# shell_number at [n, n, n], the last set of shell n.
while triplets[-1][0] < shell_number:
    lasttrip = triplets[-1]
    if lasttrip[0] == lasttrip[1]:
        if lasttrip[1] == lasttrip[2]:
            triplets.append([0, 0, lasttrip[2]+1])
        elif lasttrip[1] < lasttrip[2]:
            triplets.append([0, lasttrip[1]+1, lasttrip[2]])
    elif lasttrip[0] < lasttrip[1] and lasttrip[1] <= lasttrip[2]:
        triplets.append([lasttrip[0]+1, lasttrip[1], lasttrip[2]])

The first line of the loop makes it run until the smallest coordinate of the last generated set reaches the shell number, which, since each set is sorted low-to-high, only happens at [n, n, n], the final triplet of shell n. From there, it goes through a bunch of conditional statements that determine what the next coordinate set should be based on the last generated one—sometimes adding one to a term, other times flipping a term to zero and adding one, or rarely, flipping the whole thing (like from [1,1,1] to [0,0,2]).

You may notice something strange about this code. Namely, that it doesn’t actually generate every possible coordinate set. If you run it with shell_number = 3, as above, you’ll get this as the output in triplets:

[0, 0, 1]
[0, 1, 1]
[1, 1, 1]
[0, 0, 2]
[0, 1, 2]
[1, 1, 2]
[0, 2, 2]
[1, 2, 2]
[2, 2, 2]
[0, 0, 3]
[0, 1, 3]
[1, 1, 3]
[0, 2, 3]
[1, 2, 3]
[2, 2, 3]
[0, 3, 3]
[1, 3, 3]
[2, 3, 3]
[3, 3, 3]

They sort of just count up, as if it were some kind of weird binary on the first shell, ternary on the second, and so on. This, of course, misses all the negative versions and the different orderings of the numbers in between. But we can solve this by multiplying the influence of each triplet by the number of atoms that would sit at the same distance. Since direction doesn’t matter (only distance and count), it’s effectively the same thing.

Equal Distance Set Generator

The logic tree for doing this is pretty long, but it’s also pretty interesting. In code form, it looks like this:

eqdistset = []  # number of equidistant atoms for each triplet

for i in range(len(triplets)):
    coordset = triplets[i]
    if coordset[0] == 0:
        if coordset[1] == 0:
            eqdist = 6       # (0, 0, c): on an axis
        elif coordset[1] == coordset[2]:
            eqdist = 12      # (0, b, b)
        else:
            eqdist = 24      # (0, b, c)
    elif coordset[0] == coordset[1]:
        if coordset[1] == coordset[2]:
            eqdist = 8       # (a, a, a): on a cube diagonal
        else:
            eqdist = 24      # (a, a, c)
    else:
        if coordset[1] == coordset[2]:
            eqdist = 24      # (a, b, b)
        else:
            eqdist = 48      # (a, b, c): all distinct and nonzero
    eqdistset.append(eqdist)

This makes a list the same length as triplets that’s filled with the number of atoms equidistant to its corresponding triplets entry.

This list is also the number of atoms on each sphere of increasing radius (I touched on this sequence a bit in my original post; it’s also in the Online Encyclopedia of Integer Sequences). It was pretty fun to work out how many atoms were at the same distance based on some general facts about the particular coordinate set. If you have time, I would highly recommend trying it out yourself for fun. It’ll also give you more insight into the weirdness of that sequence.

Denominator Generator

Diagram showing the fractional atom concept
Some help with visualization

I also wrote in my last post that, in order to converge faster, we needed to account for the fraction of each atom inside the shell. To picture how this works, imagine each shell as the surface of a cube with its corners centered on atoms. The atoms at those corners have only 1/8 of their volume inside the cube, the edge atoms have 1/4, and the face atoms 1/2.

To write code for this, I first thought you would need to add the fraction of each layer and then store the remainder to add onto the next shell up. But if you really think about it, you only need to fractionalize the last layer. For everything in between, leaving the atoms alone is the same as having already added both their inside and outside parts.

This code generates a list of denominators for the fraction in the shell of the corresponding atom in triplets:

denomlist = []  # denominator of the fraction inside the shell

for i in range(len(triplets)):
    coordset = triplets[i]
    if coordset[0] == coordset[1]:
        if coordset[1] == coordset[2]:
            denom = 8    # corner atom: 1/8 inside
        else:
            denom = 2    # face atom: 1/2 inside
    elif coordset[1] == coordset[2]:
        denom = 4        # edge atom: 1/4 inside
    else:
        denom = 2        # face atom: 1/2 inside
    denomlist.append(denom)

Short and sweet. It’s just checking if a set represents a corner, edge, or face atom and then appending a value to denomlist accordingly.

Putting it All Together

Now let’s do the sum. We’ll generate a list of addends based on the other lists we made. The generating algorithm needs to determine if the coordinate set represents a sodium or chlorine atom so we know whether to add or subtract (a simple way to do this is to add the coordinate values together: if the sum is odd, the atom is the opposite type from the reference; if even, the same type). It then needs to determine if the coordinate set represents an atom on the edge of the shell, so it knows whether or not to use denomlist.

Finding the distances is a trivial task with numpy:

triplets = np.array(triplets)
distset = np.sqrt(np.sum(triplets**2, axis=1))  # Euclidean distance of each triplet

And so the final generator looks like this:

addlist = []  # signed, weighted contribution of each triplet

for i in range(len(triplets)):
    coordset = triplets[i]
    # Odd coordinate sums are the opposite atom type from the reference.
    if abs(coordset[0]+coordset[1]+coordset[2]) % 2 == 1:
        if max(coordset) == shell_number:
            addlist.append(eqdistset[i]/(denomlist[i]*distset[i]))
        else:
            addlist.append(eqdistset[i]/distset[i])
    else:
        if max(coordset) == shell_number:
            addlist.append(-eqdistset[i]/(denomlist[i]*distset[i]))
        else:
            addlist.append(-eqdistset[i]/distset[i])

Now we just need to extract our data.

print("Final calculated Madelung constant for NaCl is:", sum(addlist))

And we’re done!
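If you’d like everything in one self-contained piece, here’s a compact sketch that folds the four generators above into a single function. The logic is the same as the snippets; only the packaging is mine:

import numpy as np

def madelung(shell_number):
    # Coordinate generator: one representative triplet per symmetry class.
    triplets = [[0, 0, 1]]
    while triplets[-1][0] < shell_number:
        a, b, c = triplets[-1]
        if a == b == c:
            triplets.append([0, 0, c + 1])
        elif a == b and b < c:
            triplets.append([0, b + 1, c])
        elif a < b and b <= c:
            triplets.append([a + 1, b, c])

    total = 0.0
    for a, b, c in triplets:
        # Number of atoms at the same distance as (a, b, c).
        if a == 0:
            eqdist = 6 if b == 0 else (12 if b == c else 24)
        elif a == b:
            eqdist = 8 if b == c else 24
        else:
            eqdist = 24 if b == c else 48
        # Fraction-in-shell denominator, applied only on the outermost shell.
        denom = (8 if a == b == c else (4 if b == c else 2)) if c == shell_number else 1
        # Odd coordinate sums are the opposite atom type from the reference.
        sign = 1 if (a + b + c) % 2 == 1 else -1
        total += sign * eqdist / (denom * np.sqrt(a*a + b*b + c*c))
    return total

print(madelung(10))  # should land near the known value of about 1.748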

Ranking the Word Endings

What’s the best word ending for maximizing the number of single letters that could go before it? I made a list. So far it looks like -at is the winner here, and it’s pretty easy to see why: Bat, Cat, Fat, Hat, Mat, and so on. And those only make up the common ones. Have you ever heard of a “Jat?” Me neither, until fairly recently. But it’s definitely a real word.

In total -at can form words found in the Oxford English Dictionary using 20/26 letters of the alphabet. Abbreviations and proper nouns are excluded, of course.

11 of those letters plus -at form what I would call “common” words. There’s no strict definition here yet, but you can think of them as words that the average educated English-speaker could reasonably define without having to look them up.

In 2nd place is -it, followed closely by -ar in 3rd, -ag in 4th, and -an in 5th (only including the endings I’ve looked up). -it and -ar both make words with exactly 18/26 letters, but since -it has more common word formations (10 to -ar’s 9), I place it above.

I don’t think 3- or 4-letter word endings really have a chance to top the best 2-letter ones, but I’m definitely still looking. 1-letter word endings come close; -o in particular actually forms 18 words with single-letter prefixes, but its common word count is so pitiful I didn’t want to include it in my top 5.

The holy grail of this search is to find an ending that can make common words with more than half the letters of the alphabet. I really want to find this for three main reasons:

  1. So I can safely say that there’s a word ending so common in English that most letters make words with it (Do you realize how funny that sounds?), and I don’t think it’s enough to just have a bunch of really obscure words making up that listing.
  2. This may even have implications beyond being just a “fun fact.” It could be related to the biolinguistics of why humans prefer certain sounds over others. We could potentially uncover some interesting patterns among the top word endings.
  3. I want to be stupidly good at Scrabble/Words with Friends.

In other words, I want to find a word ending where at least 14/26 letters combine with it to make a common word. Before I go about trying to find that, though, we should probably first come up with a formal definition for “commonness.”

A formal definition that works and could be widely agreed upon would be really helpful. With that, you could, for example, write a program that cycles through some possible word endings and finds common ones really quickly.
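As a rough illustration of that idea, here’s my own sketch. It assumes a newline-delimited word list (the Unix words file here is a hypothetical stand-in for whatever dictionary you trust) and only counts raw formations, so it says nothing about commonness yet:

import string

# Hypothetical word list path; any newline-delimited dictionary works.
with open("/usr/share/dict/words") as f:
    words = {w.strip().lower() for w in f if w.strip().isalpha()}

def letters_forming_words(ending):
    # Which single letters make a dictionary word with this ending?
    return [c for c in string.ascii_lowercase if c + ending in words]

for ending in ["at", "it", "ar", "ag", "an", "o"]:
    hits = letters_forming_words(ending)
    print(f"-{ending}: {len(hits)}/26 ({''.join(hits)})")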

I can’t just use something obvious and easily interfaceable, like the number of Google search results that pop up for a given word. Uncommon words can be common search results.

As an extreme example, “wat” has become a sort of meme – the image of that strange old lady making an inquisitive face (If you don’t know what I’m talking about, go ahead and google “wat.” You’ve probably seen it before). So this form of “wat” as a comical misspelling of “what” has been used quite a bit on the internet. But I wouldn’t consider wat, as in –

/wät/: noun (in Thailand, Cambodia, and Laos) a Buddhist monastery or temple.

– to be a common word. Still, its meme status lets it turn up way more search results than “vat,” which I would consider much more common in most cases.

You could spend time figuring out a systematic method for filtering out meme-related results and whatnot, but I still think search results are just an inaccurate measure in general for what we really want from them: the actual volume of a word’s use in English. Oh, what to do?

Rant over, I guess. I’m all out of ideas. Hopefully that’s some good food for thought for today. Have at it, and tell me how it goes.

Fargo and Reverse Role Models

Movies and Role Models

I saw Fargo the other day (now one of my all-time favorites), and while the movie was full of enduring moments, one scene in particular—one that many actually seem to find rather inexplicable—is still on my mind a good while after. And before you ask, it isn’t the infamous woodchipper scene, nor the heartfelt ending speech from Frances McDormand (though it was a tour de force performance). It’s the scene that leads up to the bloody climax, the one where Marge meets up with an old friend who’s obviously still enamored with her—that one seems to stick.

In fact, whenever I see a character (generally in movies) who’s intentionally written with some very human but very clear flaws, their image tends to stay with me for a while after. Basically, I’m left thinking: “Well, gee, I don’t want to be like them.” It’s stronger if I’ve felt something close to what the character felt once, or at the very least could imagine myself in their shoes (as ever, good acting helps). I don’t think I’m the only one who does this, and I also think a lot of good movie plots actually rely on the idea of a character representing one or many negative traits that can be reflected in the average person.

And I think this idea of avoiding certain aspects of a flawed character (and the opposing, standard notion of taking on those of a good character) is an important part of the value of media. More so in movies, which can show not just the negative things but the subtle human tendencies that accompany them. I like to call these types of characters “reverse role models,” since the idea is opposite to a standard role model in that you’re essentially trying to avoid some aspect of them.

Role Models and the Reverse

Depending on how you look at it, Fargo can be full of reverse role models. Take a look at its pseudo-protagonist, Jerry: he’s not very good at being a criminal, and you can tell by his little tendencies how difficult it is for him to keep up an ongoing double lie—one to his father-in-law about the amount they’re asking for ransom and another to the hired kidnappers on how much the father is actually paying. In the end, hubris and greed do him in.

The kidnappers themselves are also a good example. Buscemi’s character is a mousy, nervous man, rough around the edges with anger issues. His partner, Grimsrud, is a play on the silent type, aloof but quickly turning to violence when the opportunity arises. Everything they do, from their day-to-day interactions with people to when they’re actually out committing crimes, reveals their negative traits in what feels like a visceral, raw manner, even though the acting is quite subtle. The slow but firm lecturing Marge gives towards the end further adds to the feeling of the movie setting an example.

An important factor linking all these characters is that they’re all believable; their motivations are within reason for any human, and so part of the movie’s job is to show us where they took things too far. And since they have that added believability, you can see how in some situation you might be the one making the same poor choices.

But again, of all the characters in this movie designed with believable flaws that fail them in spectacular ways, I still feel worse about the awkward guy than the serial murderer. Or at the very least, I’m left thinking about his actions more. Why? Well, he hits closer to home. And I think for most people, being in an awkward situation is a lot more common than killing (though the motivation, greed, is common enough). No one even considers likening themselves to the guy stuffing a body into a woodchipper, but it’s easy to put yourself into a similarly awkward situation and see all the little ways that you might also give away that you feel uncomfortable. The movie reflects what you were or could be when things don’t really go your way.

So, far from the stories of grandeur that tend to create our standard role models, the best reverse role models aren’t overblown, dramatic evils but everyday people with failures like you and me. And it makes sense that the dynamics would work out this way; if you’re aiming to be more like someone else, they had better be at the top of their respective game. But a proper warning shouldn’t come from someone so far below you that you don’t even consider having their issues. To be effective reverse role models, they need to embody the problems that you could potentially have. In my opinion, the best movies often have characters/actors that do this very well.

Reverse Role Models and You

But I think the most interesting part of it is that we don’t often do this with real people. In movies, you can sometimes feel what the characters feel, and indeed many factors are in place in a good movie to make sure that you do. You can’t not participate in feeling. It’s just you face-to-face with raw awkwardness (or whatever other bad trait they’re highlighting). But when we see a real person doing something dumb, it’s usually easier to just avoid thinking about the circumstances that got them there and how they might feel in their current situation. We certainly don’t learn from them with the same ease that we tend to have with movie characters. Humans may be the only creatures truly capable of empathy, but that doesn’t mean we’re good at it.

So why is it easier to connect with most fictional characters than real people? I have a pet theory, and it’s based on two tenets: First, when you invest time into reading a book or watching a film, you’ve already assigned value to the characters in it, more or less equal to the amount of time you spent with their respective work. On top of that, good fiction is carefully crafted to make its characters feel real. All the important characters’ stories are either shown to you directly or strongly implied. Real people don’t get this luxury with you.

Lady stalking on facebook, probably failing to capture the person as an effective reverse role model.
“It’s not facebook stalking! It’s internet research.”

So people tend to infer. And that almost never produces good results. To test it out, go look at a total stranger’s online profile. Seriously, do it. Look up a random name on Facebook or Instagram or LinkedIn or something. Really dig in there, and pay attention to all the little details. You’ll probably find yourself trying to work out all of their (usually negative) characteristics. Perhaps they have a picture of themselves that supposedly reveals some part of their personality. Maybe they have a job that you deem fit only for a certain type of person. It’s possible you even feel immense second-hand embarrassment over something you believe they’ve done, or the type of person they’ve become. The Germans have a nice little word for that: Fremdschämen (or, if you feel good about it, Schadenfreude).

Pictures generally seem to be (but aren’t actually) the most revealing parts of a social profile. In certain contexts, pictures can make people look boring, interesting, narcissistic, timid, smart, outgoing, vapid, etc. And it feels very instinctual to categorize them that way. When people say “first impressions matter,” they often forget how important it is that you look the part, too.

Sometimes, when we spot negative things about people we don’t know well, we also like to think “they should just do x, and their lives would be so much better/easier.” Even if that were true, it’s hard to imagine the different factors (often psychological) that prevent them from making that decision.

The Proposal

Now here’s the fun part: given all the assumptions made about that random person, really try to consider how valid they would all be if you actually met them. In other words, using your best judgement, give a rough estimate of the percentage of your predictions that would actually prove true. Also consider the number of characteristics that person probably has that you never would have suspected.

It might feel weird to consider a stranger fully now. As in, they’ve now become a real human with needs and desires, leading a life as complex as any other (vitally, including your own). The Dictionary of Obscure Sorrows recently coined a term for that weird feeling: sonder.

If I did something like this, I have a pretty good guess on what my conclusion would be. Based on the people I’ve made friends with, who’ve often proven my first impressions of them wrong, time and time again, I’d say the percentage is low and the amount is high, respectively. And I’d argue it’s basically the same for most people, too.

So, keeping all that in mind, I have a proposal: Let’s try and consider other people the way we already do with fictional characters—with perspective.

Madelung – The Realest Abstraction

If you’ve done any physics work before, you might have noticed that the formulas tend to include a lot of constants: the speed of light, Planck’s constant, the Bohr magneton, electron/proton/neutron masses, and so on. It makes sense that we would need constants, since it would be pretty odd/coincidental if the relationships between real, physical quantities were some pleasant numbers in our decimal system. Unlike in mathematics, physics constants are generally real things you have to measure and apply models to in order to calculate.

So, they’re not usually defined abstractly in the same way that pi or e are. Though there are still a few useful constants in physics that have abstract definitions like in pure mathematics. One of those constants is the Madelung constant—what I have fittingly dubbed “the realest abstraction”—and it’s pretty damn cool.

*Mostly known for his unethical treatment of cats.

The Madelung constant, named after Erwin Madelung (not to be confused with the other, more famous Erwin in physics)*, is used to determine the total amount of electrical potential for an ion in a given crystal lattice. If that sounds bloated, don’t worry—the exact physical interpretation won’t be important in our discussion, but you can basically think of it as the answer to this question:

Assuming an infinite structure (so whatever pattern the atoms take on just continues on forever) and approximating atoms as point charges (so any weird charge distribution is ignored), what’s the total effect from electrical forces on a single atom by all the others in the structure?

One important thing to note is that this value converges. In other words, if I start summing the effects of each atom individually and go outwards from the center by distance, the sum will tend towards a specific value (the Madelung constant). Since the effect of any single atom falls off (though not exponentially) as you increase the distance, this should make some intuitive sense.

Another interesting property of the constant is that it’s unitless—a pure maths number. In practice, it’s intended to be multiplied by some combination of the electric charge and atomic distance, but you can think of the constant itself as a fundamental property of a crystal’s structure, or even a fundamental maths constant. You’ll see why this is a good description soon.

For the crystal of NaCl (also known as salt), there are two Madelung constants: you get different values depending on whether you use sodium (Na) or chlorine (Cl) as the reference atom (beyond that choice, the constant is the same for every atom of a given type). Since the two types of atoms occupy positions in a pattern that maintains some level of symmetry if you start switching between the two, the effects of each are the same magnitude and differ only by a sign.

Here’s what it looks like. Notice how each layer forms its own “checkerboard.”

The NaCl crystal has a very simple pattern, which makes it an ideal example for this. It occupies a cube structure where sodium and chlorine atoms switch off as you move across adjacently. You can think of it like a checkerboard that extends infinitely, with Na placed where the white squares are and Cl on the black ones. Add another layer by placing another checkerboard on top of the one you already have, except shifted one space over. Keep adding layers, and pretty soon you’ll have the lattice we’re looking for.

To simplify things, before I show you the calculation, let’s set the charges of Na and Cl to unit values of opposite sign, so that the charge of Na is just 1 and the charge of Cl is -1. Let’s also set the distance between the reference atom and its nearest neighbors—the ones just adjacent to it on our checkerboard pattern (there are 6 in total)—as a distance of 1 away.

With all those assumptions, the formula for finding the Madelung constant of NaCl looks something like this:

M_{\text{Na}} = -M_{\text{Cl}} = \sum_{(j,k,l) \neq (0,0,0)} \frac{(-1)^{j+k+l}}{\sqrt{j^2 + k^2 + l^2}}

It’s stuffed, but I’ll try to explain each part: M is the Madelung constant, and the subscripts represent that constant with either Na or Cl as the reference atom (remember how the charges were reversed). The summation goes from negative infinity to infinity for (j,k,l), hitting every combination of the three along the way. You can think of (j,k,l) as the three coordinates describing a particular atom’s position in the lattice (this means (j,k,l) from negative to positive infinity will describe every possible atom). The origin (0,0,0) is our reference atom (and is excluded from the sum), (0,0,1) would be one of the 6 nearest neighbors, and so on (if you’ve done anything with 3D coordinates, it’s literally the exact same thing).

You might have also noticed that there’s a way to tell if the atom we’re looking at is sodium or chlorine just by looking at its coordinates: add them all together—if the sum is even, it’s the same type as the reference/origin, and odd sums are the other type. If you consider how the nature of this checkerboard pattern works in your head, it should start to be clear exactly why that works.

With that in mind, we can understand the numerator, which represents the electrical “effect”—it’ll work out to be positive if the atoms are the same type and negative if they’re different. Lastly, the denominator is just the distance, with larger distances giving smaller effects.

So what happens when you actually do the sum? It depends on how you sum it, and this is where things get really interesting. There are two ways to do it that make the most intuitive sense, and I’ll describe them both here.

One way is to add them up like spheres of increasing radii. Add the 6 nearest neighbors, then the next closest set ((0,1,1) and the other atoms at distance \sqrt{2}), and so on. The sum would then be -\frac{6}{1} (the 6 nearest neighbors at distance 1) + \frac{12}{\sqrt{2}} (there are 12 next-closest atoms at distance \sqrt{2}) - \frac{8}{\sqrt{3}} (the 8 “corners” of the first 3x3x3 cube) + \frac{6}{2} (similar to the nearest neighbors but one further out), and so on.

There are some really interesting properties to this summing pattern (OEIS #A005875):

  1. The number of atoms at each distance going outwards follows a peculiar sequence: 6, 12, 8, 6 (the first four already described), then 24, 24, 12, 30, 24, 24, 8, 24, 48, 6, 48, 36, 24, 24, 48, 24, 24, 30, 72, and so on.
  2. It’s especially weird when you consider that the number of atoms at each distance is the same as the number of equidistant points from a cubic center, which seems like something pretty abstract/purely mathematical.
  3. This pattern is equivalent to the number of ways of writing a nonnegative integer n as a sum of 3 squares when 0 is allowed. For example, n=1 can be written as 1 or -1 squared in any of the three square places with the other two as zero, giving 6 unique ways. (With some effort, you can figure out why that works to give you the right pattern; there’s also a quick brute-force check just below.)
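
Here’s that quick check (my own sketch; the search bound is arbitrary but large enough for small n):

from itertools import product

def r3(n, bound=10):
    # Ordered ways to write n as a^2 + b^2 + c^2, zeros and negatives allowed.
    return sum(a*a + b*b + c*c == n
               for a, b, c in product(range(-bound, bound + 1), repeat=3))

print([r3(n) for n in range(1, 11)])
# [6, 12, 8, 6, 24, 24, 0, 12, 30, 24]; the zeros are distances where no atom sits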

And that’s already a fairly interesting result from seemingly unassuming beginnings.

The red line follows the resulting Madelung Constant if you sum it using the sphere method. Look at how unpredictable and strange the trend is (The blue line is the cube method, which I’ll describe soon).

But here’s the real kicker: doing the sum this way doesn’t actually get you the right constant. In fact, it won’t even get you a constant—it doesn’t converge. And I don’t mean that in the sense that it will tend towards infinity or negative infinity, which would be boring but somewhat understandable. It doesn’t converge in the sense that as you increase the distance, it just sums to random values around the actual Madelung constant that never seem to get any closer (though taking the average of those fluctuations over a long period can work, albeit slowly).
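You can watch this happen numerically with a quick sketch of my own (not a rigorous treatment): enumerate every atom in a large cube, keep only the distance shells that fit entirely inside it, and print the running total at the end of each shell.

import numpy as np

N = 30
ax = np.arange(-N, N + 1)
j, k, l = np.meshgrid(ax, ax, ax, indexing="ij")
r2 = (j**2 + k**2 + l**2).ravel()
# Even coordinate sums are the same atom type as the reference (positive here).
sign = np.where((j + k + l).ravel() % 2 == 0, 1.0, -1.0)

keep = (r2 > 0) & (r2 <= N**2)             # only spheres fully inside the cube
order = np.argsort(r2[keep], kind="stable")
r2s = r2[keep][order]
running = np.cumsum(sign[keep][order] / np.sqrt(r2s))

shell_ends = np.flatnonzero(np.diff(r2s))  # last atom of each distance shell
print(running[shell_ends][-10:])           # the totals wander instead of settling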

You might have already realized why that’s really weird: as you get further away, the distance increases, and the effect of any individual atom is lessened. This should really lend itself to converging. But you might have noticed something else, too: while the distance increases as you get further away, meaning each individual atom has a lower effect, the number of atoms at each distance grows.

There are just generally more atoms at further distances, a fact you can pick up on just by picturing the cubic lattice. Still, the value doesn’t tend towards either infinity, so the distance and atom-count increases must somehow “balance out” in the sum, creating a sort of self-regulating parity. This is even more surprising when you consider that every other atom impacts the origin in an opposite manner, which should add to the difficulty of a potential balancing act.

It also makes the simplicity of the next summing method surprising: sum using expanding “cubes” instead of spheres, taking all the atoms in the 3x3x3 cube, then all the additional atoms in the surrounding 5x5x5 “shell,” then the 7x7x7 and so on, and it converges almost instantly. For NaCl, the value comes out to be about ±1.748 (depending on whether you use chlorine or sodium as the reference).

As a side note, it converges even faster if you only take the “fraction” of each atom that’s in the current shell. In other words, “face” atoms are 1/2 inside or outside (and so you add only half the value until the next shell), atoms on the edge are either 1/4 or 3/4, and corners count for 1/8 or 7/8. I’ll probably post some code for this soon (edit: it’s posted).

I really do think this is amazing, and I may just be scratching the surface. If I could, I’d do my thesis on this (though apparently, someone else already did a fairly exhaustive analysis).

So what other weird and interesting properties of the Madelung constant can you come up with? Have at it, and tell me how it goes.

The Analog Elevator Brain

I remember watching a video a while back that poked fun at how uncomfortable people were with the idea of a machine having control over their immediate physical surroundings. It was used as a lead-in for the topic of self-driving cars, but before that they brought up some interesting historical analogies.

When elevators first switched from human operators (back then there were actual people hired to operate elevators) to automatic user-operation, a lot of folks didn’t like the fact that there wasn’t a person there controlling the movement anymore. Elevators were a new and scary technology with lots of potential risks, and even if automated control was verifiably safer, people really wanted other people to control those risks.

Interestingly, the buttons usually didn't do anything. If pressed, the speakers played something like: "You have pulled the emergency stop. If this is not an emergency, please push it back in. If this is an emergency, please use the phone."
Feel any safer?

So, elevator engineers added some choice features to make people feel safer while riding (while not actually making them any safer): big, red emergency “stop” buttons, an in-elevator telephone, and even speakers offering soothing, pre-recorded human voice lines—“Hello. Which floor would you like to go to?”

Comparing this initial scare—which, of course, now feels almost silly—to the reservations that some have about self-driving cars today is questionable (for one, consider the number of dimensions of free movement of a car vs. an elevator—imagine a self-driving airplane!), but it inspired this thought: isn’t it weird that the technology required to generate and harness enough energy to move a giant, heavy metal container (and the people inside) against Earth’s gravity came about before the technology required to figure out where it should go while doing its job?

I think most people would say the former sounds more impressive, at least when they first hear it. But on second inspection, you would probably realize that a basic mechanics operation—regardless of how much energy it might need—will be, at the very least, figured out with much less mental effort than something complicated like machine logic.

Please don't comment anything about rope elevators.
*Evidently, something like this would not count.

And as it turns out, even an apparently simple operation like sending an elevator to the right floor based on the combination of requests and buttons pressed is pretty difficult to do automatically. Industrial Era powered elevators* were brought into mass use in the early 19th century, but completely automated ones weren’t available until almost a century later (and it took a few more decades for people to widely accept them).

The first totally automatic elevators implemented something called relay logic, which was basically a predecessor of the transistor-based logic we use today for computing. These circuits did all the work a standard human operator would’ve done before they came along—detecting which floor the elevator was on, sending it to the correct floor based on its requests, positioning the elevator level with each floor, and opening and closing the doors when it arrived.

But of course, before we got the awesome technology of automatic elevators, humanity had to wait until all the necessary advancements required to make good circuits were made, right? Maybe. I would say that a sufficiently advanced/well-designed analog system could match many of the things a digital system might do (after all, analog computers—limited by the noise of a continuous process rather than discrete truncation—have been around for a while and have certainly proven their usefulness). At the very least, this definitely has the potential to be extended to an elevator “brain” (hence our title).

So here’s the challenge: Using only pre- or early Industrial Era technology, design a working analog elevator brain that can perform or outperform all the relevant tasks that a modern elevator control system does.

Admittedly, actually solving this challenge is limited to a few people with very specific domain expertise. For everyone else, it’s a thought experiment.

If you recall that original thought (the weirdness of elevators being developed long before their control systems), you might realize that this challenge sets out to prove its premise: that automatic elevator control systems could’ve been developed earlier, without the need for any circuitry. After all, the energy required to perform these logic operations will probably be orders of magnitude less than the amount required to move the elevator itself—we only need to have a sufficiently advanced system.

You may have also realized that it totally makes sense that elevator control systems would be developed after the invention of a powered elevator, since there’s no need to control something that doesn’t exist yet—but to that I present two counterpoints:

First, a particularly forward-looking human could have come up with the design well before if it were possible, perhaps for another technology (remember the energy requirements for each). Second, a century of development is still a long time, and we can always try and prove that this could’ve happened much sooner.

To be clear, using electrical signals for this challenge is fine, but you would not be able to rely on any sort of processing through printed circuits. But before you set out to make your flowchart/schematic/blueprint/engineering draft, here’s a general set of three features I think a design should need for it to be considered fully automatic:

  1. It needs a selector (Taken from elevatorpedia. And, before you ask—yes, there’s a wiki dedicated solely to elevators and their design), or some method to accomplish the same task of detecting which floor the elevator is on.
  2. It needs to include a mechanism that will somehow stop it level to each floor, open both sets of doors, and release it when it needs to move again.
  3. Most importantly, it will need some kind of brain to process this information along with the requests to output the correct floor option (if you could only design one of these features, it should be this one)

—the “correct” floor option can mean a lot of different things. Here are the most common elevator floor-choosing algorithms, but any method should be fine as long as it makes sense:

  • First obey all the commands of those inside the elevator, then check any external requests, and finally return to the ground floor (not common in commercial elevators).
  • Perform “sweeping” operations, moving up until the highest floor request is met, then turning around and moving downwards until the lowest floor request is met (what you’re probably used to).
  • Bonus points for finding an analog implementation of more efficient algorithms.

If you think about it, you’re mostly just designing a basic computer made to do a specific task using really old parts. If that’s difficult, you can assume the computer itself is accounted for and just find a way to use the outputs of a computer to move an elevator and put inputs back in.
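For a concrete reference point, here’s roughly what the “sweeping” logic looks like as ordinary code (a digital sketch of my own; the challenge, of course, is to realize something like it without circuitry):

def next_floor(current, direction, requests):
    # One step of the sweep: keep moving while requests remain in the
    # current direction of travel; otherwise, turn around.
    if direction == "up":
        ahead = [f for f in requests if f > current]
        if ahead:
            return min(ahead), "up"
    below = [f for f in requests if f < current]
    if below:
        return max(below), "down"
    above = [f for f in requests if f > current]
    if above:
        return min(above), "up"
    return current, direction  # no requests: stay put

# Usage: start at floor 1 heading up, with floors 2, 4, and 6 requested.
floor, direction, requests = 1, "up", {2, 4, 6}
while requests:
    floor, direction = next_floor(floor, direction, requests)
    requests.discard(floor)
    print("stop at floor", floor)  # stops at 2, then 4, then 6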

So there you have it—all the tools necessary to check if Industrial Era engineers were fools for not having made automatic elevators sooner. Have at it, and tell me how it goes.
