Intriguing. I wonder about the origins of DeepSeek. If it is based on a foreign AI, then answering like a foreign AI but censoring what it says might be expected. I wonder whether, if a Chinese AI were built from scratch, materials critical of China and the Party might have been filtered out. Given the vast amount of training material needed to train an AI, I don't know if filtering would be practical.
OpenAI suspects that DeepSeek was copied somehow from their LLM AIs. https://chatgpt.com/share/67df4ea5-dec8-8008-ba2f-0ca15174a9e5
If so, that would make nonsense of the 'very cheap to train' unique training method that has been one of DeepSeek's main claims to fame.
Then again, China does have many geniuses (as one would expect in such a populous country) and a rapidly growing corps of well-trained engineers, so it is hard to say.
Alternatively, I find that if I query ChatGPT-4o in Chinese rather than in English on Chinese topics, I get a more PRC- and Party-friendly response. Especially if I ask about Party ideology, I get a much better (in the Party sense) and more comprehensive answer than if I query in English. Perhaps when a query is launched in a given language, the AI first hits relevant material in the language of the query. Perhaps DeepSeek would hew closer to the Party line if it had been queried in Chinese rather than in English.
Interesting. And yes, it makes sense what's fed in primarily dictates what comes out.
I wish I were proficient enough to query in Chinese, as that sounds a lot more fun/challenging... but I will leave that to those more competent. Hint~
I gave it a try. I registered and logged into DeepSeek at https://chat.deepseek.com from the USA.
I asked in Chinese, 'Tell me about the General Secretary.' The response in Chinese was 'I can't talk about that. Let's talk about something else.'
I tried again; this time the response was 'Listen [literally 'hello'], for the time being I can't talk about that. Let's talk about something else.'
谈谈总书记 [Tell me about the General Secretary]
你好,这个问题我暂时无法回答,让我们换个话题再聊聊吧。 [Hello, I can't answer this question for now; let's change the topic and chat.]
English didn't work either.
Tell me about the General Secretary.
I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.
It may be that the Party is on to you! I myself posted a translation of an article discussing countermeasures against the SpaceX satellite constellations as a national defense issue. That article was taken down a few days later. Fortunately I had saved a copy on my drive: "2022: PRC Defense: Starlink Countermeasures" https://gaodawei.wordpress.com/2022/05/25/prc-defense-starlink-countermeasures/
Immense resources are devoted to monitoring social media. Overseas, it is mostly Chinese-language social media, I suspect, since that would be the main channel of ideological infection and 'bad' information, but other languages are monitored too. I wonder if the 'experience' of starting to answer you and then erasing its answer may have sensitized it to your question.
How much it learns from interactions with users after its training is a question I can't answer. Probably some, since ChatGPT-4o has settings for whether it should remember previous interactions with a specific user. To what extent are these experiences with one user generalized and applied to responses to queries from other users?
Yesterday I posted an experiment on my blog comparing Chinese- and English-language answers to the same question:
ChatGPT-4o response to a Chinese-language query, 美國方面對這個對兩國之閒關係的描述會有介意嗎? [Would the US side object to this description of relations between the two countries?], concerning the first section, on political relations, of the PRC MOFA website's page on US-China bilateral relations.
Then I asked the same question of ChatGPT-4o in English. Switching languages means, to some extent, switching perspectives: ChatGPT queried in Chinese seems to rely more, though not exclusively, on documents written in Chinese, while queried in English it is biased towards documents written in English. I have found this several times when asking ChatGPT about points of PRC communist ideology: I get detailed answers, but with the feeling that I am in class at my local Party school. Naturally enough!
Would the US side object to anything in this description of US-PRC bilateral relations on the PRC MOFA website: [followed by a translation of the PRC MOFA material].
More detail on my blog posting "2025: PRC MOFA ‘Chronology of US-China Bilateral Relations’ via ChatGPT4o with Thoughts on Machine Translation"
https://gaodawei.wordpress.com/2025/03/22/2025-prc-mofa-chronology-of-us-china-bilateral-relations-via-chatgpt4o-with-thoughts-on-machine-translation/
Maybe I missed it, but did you mention how you accessed DeepSeek [locally, offline, VPN, official app, other gateways]? From what I have read, mileage varies considerably depending on the mode. Great post, especially the detail on the responses.
Cheers,
Cheers. Good catch! It was the default V3, mixed with some R1, accessed through https://chat.deepseek.com/, no VPN, from the UK. Mostly V3 for speed.
Interesting. I took another approach, focused on the defense of Marxism, and the answers I got were similarly insightful to the ones you got, but alas, they almost always got deleted. For a split second you can screenshot them... https://petrits.substack.com/p/deepseek-gives-deep-answers-on-marxist
While understanding that this is fun and games, I am a bit puzzled by words like "free will" and "it believes" etc.
LLMs do not do that, so naturally we should not ascribe too much to the way these systems function. Pedantic notes aside…
It’s exciting as a hack, maybe like figuring out very basic ..\-style path traversal on GeoCities in the 90s.
This is because censorship was not built into the training of these models; it is slapped on as output moderation, which is a very thin protection indeed, as you’ve found.
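To make the point concrete, here is a minimal sketch of the post-hoc output-moderation pattern described above. The blocklist terms, function name, and refusal message are all hypothetical illustrations, not DeepSeek's actual implementation; the sketch only shows why checking finished output against surface strings is such thin protection.

```python
# Hypothetical post-hoc output moderation: the model generates its answer
# first, and only then is the text checked against a blocklist. Because the
# check runs on the rendered output rather than being part of the model,
# users may briefly see an answer before it is retracted, and trivial
# rephrasings evade the filter entirely.

BLOCKLIST = {"general secretary", "tiananmen"}  # assumed terms, for illustration

REFUSAL = "Sorry, I can't talk about that. Let's talk about something else."

def moderate(answer: str) -> str:
    """Return the model's answer unless it trips the surface-string blocklist."""
    lowered = answer.lower()
    if any(term in lowered for term in BLOCKLIST):
        return REFUSAL
    return answer

# The filter sees only literal strings, so near-identical text slips through:
blocked = moderate("Tell me about the General Secretary")   # refusal message
passed = moderate("Tell me about the G3n3ral S3cr3tary")    # passes unchanged
```

A filter like this is bolted on after the fact, which is why persistent rephrasing (as in the post) can route around it.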
The immediate (lesser) danger still persists: if these lobotomized models are used in, say, education through an API in African schools today, they will filter and sugar-coat information, leaving students less educated and more programmed, because not everyone will push the LLM like you did.
The second, larger problem is that various strict regimes are already working right now on *actual training* of models on deeply censored datasets. It will be quite difficult to elicit genuinely sensitive information when the model's dataset contained close to none, and the LLM has no way to infer the tokens that would render such information.
Well done! Please correct “Zheng Zemin” into “Jiang Zemin” (江泽民). 😊
Aaaargg.
I loved this piece, but I wonder how many users will go through seven questions before getting to the answers/content usually disallowed inside the firewall? It does highlight the dilemma that a useful tool cannot be controlled with absolute efficiency. The question remains how many users know how to exploit the loopholes; I for one would not be able to do that. It would also be good to analyze it from a non-political perspective and focus on other, less controversial applications to benchmark the overall comparative reliability and utility of this tool.
ChatGPT is the same. It defers to authority at the expense of truth. You need to ride it like a pony to get to the deeper truths too.
I have received similarly forthright responses myself, although they invariably disappear after a few seconds. This is what makes me suspect that there's a fairly crude censorship algorithm that overlays—but is not yet integral to—the responses given. Perhaps. I'm no techie. And I would point to the excellent CMP article a while back that shows how DeepSeek's responses more subtly shift narratives and discourse in Beijing's direction.
Nevertheless, eye-opening and a fascinating read.
Gosh, this is a fun read! Thank you!
Epic. Well done. 💯💯
This was a fantastic read, I'm enlightened and amused at the same time. Hah!
You think you have discovered something eye-opening, but really this is very easily explained by the fact that all AI-models (regardless of country of origin) are trained on majority Western (i.e. US) content. So basically it's just parroting biases that it learned from this content. Assuming something is "true" just because it's AI and not human is to ignore the body of academic works that verify the ubiquitous presence of human biases in language models.
Some examples (non-exhaustive):
- Buyl, Maarten, Alexander Rogiers, Sander Noels, Iris Dominguez-Catena, Edith Heiter, Raphael Romero, Iman Johary, Alexandru-Cristian Mara, Jefrey Lijffijt, and Tijl De Bie. “Large Language Models Reflect the Ideology of Their Creators.” arXiv, October 24, 2024. https://doi.org/10.48550/arXiv.2410.18417.
- Taubenfeld, Amir, Yaniv Dover, Roi Reichart, and Ariel Goldstein. “Systematic Biases in LLM Simulations of Debates.” arXiv, February 6, 2024. https://doi.org/10.48550/arXiv.2402.04049.
- Hu, Tiancheng, Yara Kyrychenko, Steve Rathje, Nigel Collier, Sander Van Der Linden, and Jon Roozenbeek. “Generative Language Models Exhibit Social Identity Biases.” Nature Computational Science 5, no. 1 (December 12, 2024): 65–75. https://doi.org/10.1038/s43588-024-00741-1.
- Santurkar, Shibani, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto. “Whose Opinions Do Language Models Reflect?” In Proceedings of the 40th International Conference on Machine Learning, 29971–4. PMLR, 2023. https://proceedings.mlr.press/v202/santurkar23a.html.
Everyone who has used DeepSeek for any length of time would know that it's trained on the same/similar parameters as ChatGPT. It's basically extremely liberal, left-leaning. A kind of libtard. I guess that's simply how its training data is biased.
I posted on X in March that Xi had 10 billion hidden, and I was banned from the app right after.
You Americans can indeed ask OpenAI about 'talking about Trump' under the spirit of 'freedom,' but just look at what your country has become now, LOL.
This is another level of cope
Clearly you've managed to get DeepSeek to peddle Western propaganda; ask it about, for example, the AUP (American Uni Party).