The Scientific Method in Proficiology

In a sense, studying generic sentient beings is like studying a fictional universe. There’s no obvious notion of accuracy or correctness – after all, a generic mind is not necessarily an accurate depiction of human psychology. The whole field seems self-contained, and any statement about the world of the generics appears unfalsifiable, in the same way that theories about a fictional universe are impossible to disprove through experiment.

Why should we care about such fictional domains? When the stakes for explaining or understanding a phenomenon are high, we look toward science for answers. It is tempting to treat scientific concepts as concrete things that objectively exist, making scientific theories seem much more tightly grounded in reality than the aforementioned theories about fictional universes. But I argue that in many cases, the two are not all that different. Of course, I am not saying that scientific work is purely fictional. All bodies of science agree that statements must be tested against real-world observations in order to be taken seriously. However, there are many different ways to do scientific work, and by grouping these methods under a common name we desensitize ourselves to their differences. In pointing out these differences, I will show you how scientific theory can be more fantastical than we initially expect.

Science-p: The Study of Evidence-based Procedure

Let’s start with the driest, least fantastical way in which science can be done. Suppose that we expect to frequently encounter some familiar scenario, and that we have an explicit goal to achieve in that scenario. In other words, we have a problem statement – each encounter with this scenario is an instance of our problem, and we must try to achieve our goal for each problem instance to the best of our ability. We generally do what I will call science-p when we research the problem in the hope of improving our ability to solve it. To explain science-p, we should first understand the main challenges of trying to solve our problem:

- The methods used to achieve our goal can be quite involved, requiring many complicated steps. We need to somehow learn about such methods in the first place.
- There can be many hidden ways in which the problem instances can differ. We need to reliably detect these differences so that we use the correct methods to solve them.
- Our methods do not work every time, and can differ in the extent and dimension in which they solve our problems. We likely need to try our methods many times in order to assess their performance.

In science-p, observation is king and above all else our focus is on trying to solve our problem. In a research paper on a science-p subject, the experimental setup / results / analysis sections are the most important. Exactly why the result is achieved is of second-class importance – for science-p, a method that demonstrably and consistently achieves great results across a variety of problem instances is a good one, even if researchers are not very sure how it works.

This type of work is common in medicine¹, so let’s use that field as an example. Suppose we know about one treatment procedure for a disease – perhaps some plant was commonly used as a folk remedy for the disease, and early experiments show that it does indeed improve the disease outcome in a measurable way². It is far from perfect, though – the treatment effect is pretty mild and some people do not respond to the medication at all. Science-p research can be done to identify whether there are differences in the symptoms or demographics of people who are affected less by the medication. Researchers can try changing how we administer the medication to increase its potency. They can try to extract compounds from the plant to see which ones are responsible for the improvement, and try creating other methods of administering the same compound. Throughout the process, the community creates new methods, detailed measurement procedures, and an extensive taxonomy of disease instances / treatment outcomes / side effects. But this research doesn’t help predict what happens when you do something substantially different from the standard procedure, so science-p researchers must rely on other forms of science in order to come up with creative new approaches for the problem.

Science-u: The Quest for a Universal Model

In science-u we no longer have an explicit problem to solve. A theory from science-u is designed so it can be applied to an arbitrary setup that has never been studied before. One of the main goals of science-u is to create a model that can predict to high precision the outcome of any setup from a universe of possibilities (which can contain hypothetical scenarios that are difficult if not impossible to engineer with current technology).

The creation of a model here is crucial – generally there are infinitely many setups in our universe and we need a way to predict an outcome in each case. To use science-u to predict the outcome of a setup, we first translate the setup into our model in a way that makes the observable properties of the setup match our specifications. Then we apply the model and try to find other properties that can be measured by observation. The science-u predictions are precisely the outcomes of those observations. We just need to run an experiment where we create the appropriate setup, make our observations, and compare against the predicted results.
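As a minimal sketch of this translate, predict, and compare loop, consider a toy free-fall model. The free-fall scenario, the function names, the tolerance, and the "measured" value are all assumptions made for illustration; none of them come from the discussion above.

```python
import math

G = 9.81  # m/s^2, approximate gravitational acceleration near Earth's surface


def predict_fall_time(height_m: float) -> float:
    """Toy model: an object dropped from rest, ignoring air resistance."""
    return math.sqrt(2.0 * height_m / G)


def agrees(predicted: float, measured: float, tolerance: float) -> bool:
    """Compare a prediction against an observation, within a tolerance."""
    return abs(predicted - measured) <= tolerance


# Step 1: translate the setup into the quantities the model understands.
setup = {"height_m": 20.0}

# Step 2: apply the model to derive another observable property.
predicted_time = predict_fall_time(setup["height_m"])

# Step 3: run the experiment and compare (hypothetical stopwatch reading).
measured_time = 2.03
print(f"predicted {predicted_time:.3f} s, measured {measured_time:.3f} s")
print("consistent with model:", agrees(predicted_time, measured_time, 0.05))
```

The tolerance stands in for measurement error; a real experiment would have to justify it from the precision of the equipment.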

In science-u, it is important that the setups are defined as precisely as possible. Problem instances from science-p are expected to vary considerably, but we don’t want a given setup from science-u to contain hidden variations. Experiments must try to engineer these setups with very precise equipment in order to properly test the predictive accuracy of a science-u theory. Science-u work is common in physics³, a field which uses some of our most advanced equipment for experimental work.

A second important goal of science-u is to create models that are as simple as possible. It’s possible to fit observational data with complex curves built out of thousands of parameters, and build a theory saying that reality fits our complex curve. But such a theory would be highly unsatisfying to science-u researchers, even if the complex curve does create accurate predictions. I believe part of the reason is that we have a limited budget for designing and running science-u experiments, so we need to be smart about which experiments we choose to run. We need to come up with setups that are different from the setups of past experiments but are still possible to replicate with our technology, and we use our knowledge of the theories themselves to come up with such experiments. This is hard to do with a complex theory which only works due to the fine tuning of thousands of parameters.

Science-a: Using Models for Approximation

Sometimes, a field of scientific research attempts to study an existing real-world environment that is so complicated that we cannot really hope for a universal model. Similar to science-u, the primary goal of science-a is to make accurate predictions about setups from the environment. However, we can’t even observe all the nuances of the setups we care about, let alone faithfully replicate these setups in experiments. Instead, science-a focuses on the creation of many different models, all of which work on a simplification of the environment and can be used to make imprecise predictions for a given setup.

Science-a models are much more fallible than their science-u counterparts, so it is important for researchers to recognize the limitations of these models and know when to choose one model over another. Even with such precautionary steps, science-a work can be much less reliable than science-p or science-u work. For example, science-a is commonly used in economics and cognitive / social psychology⁴, and we constantly find economists disagreeing on what effects a government policy will have, or psychologists disagreeing on why we see certain tendencies in human behavior. By no means is science-a useless though. Having multiple science-a models can help scientists study an environment from many different perspectives, and by having discussions over these models the depth of the community’s understanding of the environment improves over time. Doing science-a research not only helps create more accurate science-a models, it also reveals previously unstudied nuances of the setups, thereby helping researchers create more precise research questions in the future.

That said, I think there is a tendency for science-a to perform deliberate underfitting, where the community believes they will never fully understand the nuances of their setups and decides to focus on the most common behaviors or partition their setups across fuzzy boundaries. For example, researchers who build psychological and economic models implicitly understand that some people or groups behave differently from what the model predicts, but are generally happy with successfully modeling the behavior of the majority. They may also choose to focus on a specific culture or government style, even though people / governments from different categories may be similar and people / governments can vary considerably within one category. These kinds of simplifications help advance the field as a whole, but leave us unable to make sense of the exceptional setups within the environments we study⁵.
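To make deliberate underfitting concrete, here is a small sketch of a model that only records the most common behavior in each category. The categories, behaviors, and observations are invented for illustration, and a majority-vote rule is just one assumed way such a model could be built.

```python
from collections import Counter

# Hypothetical observations: (category, observed behavior).
observations = [
    ("group_x", "saves"), ("group_x", "saves"),
    ("group_x", "saves"), ("group_x", "spends"),
    ("group_y", "spends"), ("group_y", "spends"),
    ("group_y", "saves"),
]


def fit_majority_model(obs):
    """Keep only the most common behavior per category (deliberate underfit)."""
    counts = {}
    for category, behavior in obs:
        counts.setdefault(category, Counter())[behavior] += 1
    return {cat: c.most_common(1)[0][0] for cat, c in counts.items()}


model = fit_majority_model(observations)

# The exceptional cases are exactly the ones the model cannot account for.
exceptions = [(c, b) for c, b in observations if model[c] != b]
accuracy = 1 - len(exceptions) / len(observations)

print(model)
print(f"accuracy on the majority: {accuracy:.2f}")
print("unexplained exceptions:", exceptions)
```

The model scores well overall while saying nothing about the minority within each category, which is the trade-off described above.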

Scientific Models as Fantasy

In science-p, we don’t really try to make models. We might borrow models from science-u or science-a to help us come up with better methods for our problem, but we have no need to predict what happens in any scenario other than the one that occurs in our problem instances. We can talk purely in concrete terms – our methods, our problem instances, and how well we are achieving our goal.

In science-u, we have a universal model that tries to predict everything correctly. Typically when we have multiple models, they do not agree and sooner or later we will run an experiment that disproves one model in favor of another. It is tempting to believe that there is only The One True Model that is basically synonymous with objective reality, but even with our harsh requirements we can end up with multiple models which predict the exact same behaviors. For example, there are several equivalent interpretations of quantum physics, which appear wildly different but ultimately produce the same predictions⁶. Gravity is usually described as a “force acting at a distance”, but it is equally if not more accurate to describe gravity as the curvature of a 4D surface. Both the invisible force and the 4D surface are fantasies – they exist only within the model, not as actual “things” in the universe.

In science-a, we are aware of the fact that we are using models, and that they are mere simplifications of a complicated environment. Each model describes a fantasy world of simplified setups and outcomes, and we only ask for these models to give fairly accurate predictions in some situations, i.e. the majority of setups within a certain category. There is less of a tendency to elevate the model to objective existence, but it still happens all the time. For example, to some the existence of a person’s subconscious mind / memories can seem just as objective as the existence of their eyes / brain.

Science-s: Spanning all Possibilities with Customized Models

I consider proficiology to be applying a different kind of scientific approach, which I will call science-s. In science-s, we have multiple possible environments of real-world setups. Each environment fits inside a universe which contains both naturally occurring and hypothetical setups. The outcome of a given setup can depend on the environment we are working under, so it doesn’t make sense to talk about predicting the outcome of the setup. But science-s is not interested in making this kind of prediction anyway. Instead of evaluating based on predictive accuracy, science-s uses a plausibility filter, a function which determines whether a given theory produces “acceptable” outcomes for all setups within a universe. We assume the plausibility filter is already given to us, and that we can use the filter to judge the plausibility of a theory even on hypothetical scenarios.

The criterion of “acceptability” is more lenient than that of predictive accuracy. In fact, there can be infinitely many theories which are deemed plausible. Science-s focuses on creating not just one plausible theory, but a customizable theory template which can produce many plausible theories. Similar to science-u, this theory template is based on a model, albeit one with configurable parameters. The primary goal of science-s is to create a theory template which can generate a plausible theory consistent with the observed outcomes of real-world setups from one environment; the secondary goal is to make the models as simple as possible. We can evaluate the quality of a theory template with the following procedure:

1. Choose some environment of interest. It should already contain a history of past setups with their corresponding final outcomes.
2. Generate a theory from the theory template which fits as closely as possible with the outcomes of past setups. In other words, use the historical outcomes as “training data” for a customized model.
3. Run the theory through the plausibility filter to get a plausibility score.
4. Repeat the above process several times with many different environments, and judge the quality of the theory template by the plausibility scores & how closely the generated theories fit the past setups.
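The four steps above can be sketched in code. This is only an illustration under loud assumptions: setups and outcomes are reduced to single numbers, the theory template is a one-parameter line, and the plausibility filter is an arbitrary range check standing in for whatever judgment the field actually uses. Every name here is hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical simplification: a setup and its outcome are plain numbers.
History = List[Tuple[float, float]]
Theory = Callable[[float], float]


def linear_template(history: History) -> Theory:
    """Toy theory template: outcome = slope * setup, with a configurable slope.

    Fitting picks the least-squares slope for one environment's history.
    """
    num = sum(s * o for s, o in history)
    den = sum(s * s for s, _ in history)
    slope = num / den if den else 0.0
    return lambda setup: slope * setup


def fit_error(theory: Theory, history: History) -> float:
    """Mean absolute gap between the theory and the recorded outcomes."""
    return mean(abs(theory(s) - o) for s, o in history)


def plausibility_filter(theory: Theory, universe: List[float]) -> float:
    """Stand-in filter: fraction of universe setups with an acceptable outcome.

    'Acceptable' here arbitrarily means the outcome lies in [-100, 100].
    """
    ok = sum(1 for s in universe if -100.0 <= theory(s) <= 100.0)
    return ok / len(universe)


def evaluate_template(template, environments: Dict[str, History], universe):
    """Fit one theory per environment, then score each generated theory."""
    report = {}
    for name, history in environments.items():   # steps 1 and 4: each environment
        theory = template(history)                # step 2: fit to past outcomes
        report[name] = {
            "plausibility": plausibility_filter(theory, universe),  # step 3
            "fit_error": fit_error(theory, history),
        }
    return report


environments = {
    "env_a": [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    "env_b": [(1.0, -30.0), (2.0, -60.0)],
}
universe = [0.0, 1.0, 2.0, 5.0, 10.0]  # includes setups never observed

report = evaluate_template(linear_template, environments, universe)
for name, scores in report.items():
    print(name, scores)
```

In this toy run both generated theories fit their histories exactly, yet the theory for `env_b` is judged less plausible because it produces out-of-range outcomes on hypothetical setups. That gap between fit and plausibility is what the evaluation is meant to surface.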

In the case of proficiology, the environments are the lives of real people or fictional characters, and the setups are the events that have happened to them & their responses. A universe may contain both actual events and fictional / counterfactual events, even though the latter do not appear as “training data”. Our theories are descriptions of how a character will respond to arbitrary events, and we create a plausibility filter by having judges evaluate whether a theory describes the behavior of an intelligent / sentient being, i.e. whether we think it is plausible that a sentient being will behave in the way described by the theory. We use a theory template to generate custom models and fit the models to a character’s known actions. We subject these models to a myriad of possibly fictional events to see how they respond, and we show these results to judges, asking whether the models are behaving appropriately as sentient beings.

An Argument for Science-s Research

Despite its strong emphasis on hypothetical setups and customized models that are demonstrably not objective / universal, I believe science-s should be adopted as a legitimate research methodology because science-s research is a productive way of studying the full span of possibilities for poorly understood and highly varied phenomena such as sentience:

- A science-p approach is overly pragmatic; focusing purely on solving familiar problem instances does little to reveal the broader patterns of the phenomenon of interest, and leaves us unprepared for future / less familiar issues.
- A science-u approach is impossible if we cannot make reproducible observations or precisely engineer specific setups. We would require in-depth understanding and control over the factors that are causing the phenomenon of interest to vary.
- A science-a approach tends to oversimplify and makes no attempt to explore variations of the phenomenon beyond what is observed in reality. Without a theory template, we are forced to shoehorn the full range of possibilities into a finite set of theories, and even if we legitimately didn’t care about uncommon cases we would have trouble understanding non-static environments that slowly drift toward unexplored possibilities.

Science-s recognizes that there are many variations of the phenomenon we want to study, and gives us a framework to create customized theories covering any variation, even hypothetical ones that potentially could appear in a future environment. But of course, science-s has its limitations too. It may be hard to find a good plausibility filter for the phenomenon – judges may disagree in their assessment of plausibility, and over time people may develop different opinions about the topic. In addition, even if we can create highly consistent and plausible theories, we won’t necessarily be able to predict the outcomes of future setups from the environment. This can happen if the phenomenon contains so much variation that we need an unrealistic amount of “training data” in order to predict future outcomes. But much like how science-a can be useful even if its predictions can be unreliable, I believe that even with imperfect plausibility filters and an inability to make predictions, science-s work can advance our understanding of real-world environments and the phenomenon of interest as a whole.

Proficiology takes a science-s approach to study the full span of sentient thought, including the cognition of people who are different from the majority in many ways. By using generic minds containing experience foundations / lenses / agents to model sentient thought, we can produce custom models that are consistent with a person’s past behavior and can respond plausibly to future events. Studying this model can deepen our understanding of the person’s behavior, even if the model doesn’t accurately predict their future behavior. Creating, evaluating, and attempting to simplify entire theory templates helps us understand sentience as a whole, which is great considering that we may be on the cusp of developing artificial sentience with AI technology.

Footnotes

1. It is also common in applied sciences, abnormal psychology, and deep learning research.
2. As for how our ancestors discovered this method in an era that predates evidence-based medicine, well, people will try random things and find some truths (and a lot of myths) through trial and error.
3. In fact most science-u work is physics, which to be fair is a very broad field of study. But I think there are examples of non-physics science-u work in biochemistry and the Earth sciences.
4. It is also used in ecology, evolutionary biology, linguistics, sociology etc.
5. Note that deliberate underfitting is unacceptable in science-u. A science-u theory is expected to be universal, and any exception is a weakness of the theory.
6. In essence the theories are:
   - There is only one universe which is fundamentally built out of probability distributions and has no objective properties (Copenhagen interpretation)
   - There is a vast multiverse of branching possibilities of which we only observe one branch (many-worlds interpretation)
   - There is only one universe which objectively contains both matter and waves, where matter moves by riding on top of the waves (pilot wave interpretation)
   Like the gravity example in the next sentence, the probability distributions / multiverse / pilot waves don’t actually exist as concrete “things”; they are only used as parts of the model.

Category: Proficiology

The Proficiologist