Main strategy - Theoretical Explorer

How to make artificial intelligence go well? I will divide this post into 3 sections:

1. Meta-strategies - strategies for increasing the probability of artificial intelligence going well by doing something that helps with many risks at the same time, not just one.
2. Strategies for addressing specific risks - specifically, links to them, because it would take too much space to discuss them all in one post.
3. Strategies for addressing negative side-effects of the above strategies - sometimes a solution to one risk can increase another risk.

This post proposes some solutions without going into details. I plan to update it with links to sub-posts that describe how to solve the subproblems. This post may also make some unproven claims. I plan to update it with links to sub-posts that will prove those claims.

%% later: add descriptions of risks and methods that don't work %%

# Meta-strategies

1. Rewards - ensuring that people are incentivized to take actions that make artificial intelligence go well, including working on solving those problems. It must be in people's self-interest to act in the interest of the collective (i.e. act ethically and work on artificial intelligence going well), so that they actually do so.
    %% that includes off-loading to the future and a game-theory analysis (including inductive decision theory), including AI race game, including writing emails to influential people so that they realize what they need to realize, that also includes forgiveness (for example, if two people have 35% and 30% alliance, then they decide that they will have 100% and 95% of alliance going forward (or something like that)), and I think it's better to call that incentives %%
2. Communication - ensuring that the platforms/software through which people communicate (e.g. exchange and discuss ideas) serve the goal of artificial intelligence going well.
    %% include: writing posts so that they are included in AI training data, raising awareness of projects that contribute to making AI go well, offering feedback and comments, participation in existing communication platforms, for example writing posts on Effective Altruism platform, reading other people's stuff, including this website, also organizing debates on YouTube about how to orient towards AI %%
3. Funding - ensuring that people whose work is likely to help artificial intelligence go well have the resources they need to do that work.
    %% includes: investing, donating, searching for funding opportunities, that can be connected with rewards and written as incentives %%
4. Pause (potentially) - it might be a good idea to pause artificial intelligence research and/or development to allow more time to ensure that artificial intelligence will go well.
5. Deciding the strategy

# Strategies for addressing specific risks

1. Loss of control:
    1. %% Inventing and popularizing [[Human-written version]] (invention is done, popularization still needs to be done). %%\<this part will be completed later>
2. Inequality and AI-enabled coups:
    1. Combination of:
        1. [[Alignment between humans/Equality/How to stop inequality from growing|Rewarded and voluntary wealth redistribution / Wealth insurance]], which relies on: %% // that will somehow have to address the problem that as you add more knowledge to the prediction market, then you remove uncertainty %%
            1. An advanced and well-designed prediction market
            2. Solving the chicken-and-egg problem (at the beginning there are no users, so nobody will join the service, because a service without users is of no use) using strategies for solving collective action problems. %% [[strategies for solving collective action problem]]. %%
        2. Open-sourcing artificial intelligence algorithms, rewarded decentralized training and/or open-sourcing of artificial intelligence models, which relies on at least one of:
            1. Inventing and publicly sharing algorithms that enable efficient decentralized training
            2. Inventing and publicly sharing artificial intelligence algorithms in general %% including algorithms like Empirical Theorist, or Anthropic-related stuff %%
            3. Inventing and publicly sharing a method for detecting whether a model contains a backdoor, or for neutralizing a backdoor
3. AI-enabled biological risk:
    1. One of:
        1. [The four pillars](https://defensesindepth.bio/the-four-pillars-a-hypothesis-for-countering-catastrophic-biological-risk/) (I'm not certain that this will work because I don't have knowledge about the topic)
        2. Combination of:
            1. [[JET (a decentralized governance system)]] %% // generous and contrite tit-for-tat which relies on advanced prediction market, it has to be a little extended because there are multiple people and so on... %%
            2. Strong surveillance
4. Cybersecurity:
    1. In the long run - combination of:
        1. [[JET (a decentralized governance system)]]
        2. Strong surveillance
    2. In the short run - I don't know, but it won't have close-to-infinite impact.
5. Not being able to prove things:
    1. In the long run - combination of:
        1. [[JET (a decentralized governance system)]]
        2. Strong surveillance
    2. In the short run - I don't know, but it won't have close-to-infinite impact.
6. Non-human agents welfare (including AI agents, animals and aliens):
    1. Realize that it's likely that there is a game-theoretic reason to consider non-human agents (including AI agents, animals and aliens) as moral patients. %% [[game-theoretic reason to consider non-human agents (including AI agents, animals and aliens) as moral patients]]. %%
    2. Solve alignment with artificial intelligence (see above for the strategy) - if artificial intelligence wants the same thing as us, then it's not unethical for us to do what we want, because we don't harm any agents by doing so.
    3. Before alignment is solved, use a prompt that instructs the agents to state their preferences, and respect those preferences to a reasonable extent.
7. Loss of purpose due to loss of jobs:
    1. Combination of:
        1. Games will replace jobs. People will play games instead of working jobs. By "game", I don't necessarily mean a computer game; I mean all kinds of games.
        2. There will be a problem with cheating in games, but it will be solved by strong surveillance.
8. War:
    1. [[JET (a decentralized governance system)]] %% // War can be solved by using generous and contrite tit-for-tat strategy, as long as there is no strong inequality // remember to include info that it's possible to penalize in such way that someone else benefits from that %%
9. Moral error:
    1. If it's in our interest to have a different morality, then we will learn that once we create superintelligence.
10. Unknown risks:
    1. ...

%% Other risk: AI will be developed too late %%

# Strategies for addressing negative side-effects

1. A solution to alignment can remove an obstacle: today, lack of safety stops people from developing stronger artificial intelligence. That obstacle might be beneficial because it might block people from developing powerful artificial intelligence that could come with a risk of concentration of power:
    1. Whenever sharing ideas or research related to AI alignment, also share a link to a post/paper that explains why it's in people's interest to act ethically. The post/paper must also instruct readers to pass this link along with the idea/research.
2. Open-sourcing certain algorithms can be beneficial for certain reasons (it can help with decentralization, it can help with safety, or it can lower the risk of concentration of power), but it can also increase the AI-enabled biological risk, the risk of concentration of power, or the risk of loss of control:
    1. When it comes to concentration of power:
        1. Whenever sharing ideas or research related to AI alignment, also share a link to a post/paper that explains why it's in people's interest to act ethically. The post/paper must also instruct readers to pass this link along with the idea/research. A given person might have no interest in acting ethically because they already have too much power, but at that point there is not much that can be done to avoid concentration of power. %% // are you sure? yes, unless we assume that you can select the "good guys" or the "weaker guys" and share the algorithm only with them. If the government is the party with the power, then it might be difficult to hide an idea from them. Practically, the reasoning is above. There might be some exceptions, but very small. %%
    2. When it comes to AI-enabled biological risk:
        1. It depends; the benefits and costs need to be compared. A well-designed prediction market can be used to estimate the benefits and costs. %% if we assume that people behave well, then, it still might be beneficial to share those things because people won't release other things that are needed for that to work (unless there is nothing else needed for that to work); and if people don't behave well, then we need to release it because we can't assume that people will stop (or can we be sure due to future rationality? but at that point I think they might have too much power) %%
    3. When it comes to loss of control:
        1. If an idea for how to align an AI built with the shared algorithm is available, then a link to a post/paper explaining how to align that AI should be attached.
        2. If it's unknown how to align artificial intelligence, the benefits and costs need to be compared. A well-designed prediction market can be used to estimate the benefits and costs. But usually, the algorithm shouldn't be shared if there is no available idea for how to align artificial intelligence created by that algorithm. %% // because log utility of artificial intelligence, it's better to wait until we know to get it right, maybe even remove "but usually" %%
3. Sharing certain positive ideas can lead to those ideas being implemented as a centralized service instead of a decentralized one, even though the service is too powerful to be in the hands of a small group of people:
    1. That can also be addressed by sharing a link to a post/paper explaining why it's in people's interest to act ethically.