People’s Gazette, a Lagos-based online media company, faces a lawsuit from lawyers for Peter Obi, the presidential candidate of the Labour Party, over an audio recording the platform published that purports to reveal a private conversation between Obi and Bishop David Oyedepo, the founder of Winners Chapel, one of the largest churches in Nigeria.
In the leaked audio, one of the men, whose voice closely resembles Obi’s, urges the other man, who is supposed to be Oyedepo, to canvass for Christian votes on his behalf. As of April 10, the leaked audio had been viewed over 10.3 million times, according to Twitter stats.
The presidential candidate has denied the audio, and his party has described it as a “deepfake”.
“Let me reiterate that the audio call being circulated is fake, and at no time throughout the campaign and now did I ever say, think, or even imply that the 2023 election is, or was, a religious war. The attempts to manipulate Nigerians is very sad and wicked. Our legal team has been instructed to take appropriate legal actions against People’s Gazette and others,” Obi said in a statement.
A statement released by the Labour Party described the audio as part of the continued attempts by the All Progressives Congress to tarnish the image of its presidential flagbearer.
“From the show of shame in Port Harcourt to the drama in the Ibom Air aircraft, both of which they contrived, they have now moved to the circulation of a deep fake audio file aimed at promoting religious tension in the country,” Diran Onifade, head of media and communications of the Obi-Datti Presidential campaign, and Yunusa Tanko, the campaign’s chief spokesperson, said in a joint statement.
While People’s Gazette has said it stands by the authenticity of the audio, the media company has suspended the reporter who wrote the report that came with the leaked audio, over “his conduct online that violated the newspaper’s social media policy and called into question its integrity.”
What is a deepfake voice?
A deepfake voice, also known as a voice clone, is a synthetic voice that closely mimics a real person’s voice. Although artificial, it sounds humanlike and can accurately replicate tonality, accent, cadence, and other unique characteristics.
Creating a deepfake requires artificial intelligence (AI) technology and substantial computing power. In other words, deepfakes use AI to generate completely new video or audio, with the goal of portraying something that did not actually happen.
According to experts, cloning a person’s voice is not a walk in the park. It can take weeks to do successfully and, apart from specialised tools and software, deepfakes require training data. That often means having sufficient recordings of the target person’s voice. However, new tools are being deployed to shorten the time it takes to clone a voice. For example, Resemble AI is considered one of the most powerful audio tools for creating deepfake recordings. The cloning software doesn’t need vast amounts of data before it can start cloning.
Deepfakes and generative adversarial networks emerged in the 2010s with advances in machine learning algorithms. Deep neural networks, for example, can generate hyper-realistic fabricated images and videos, known as deepfakes, by superimposing one person’s face onto another’s body or by synthesising entirely new content.
Why deepfake in Nigeria’s politics matters
There is no concrete evidence that the audio is fake, and the voices could easily pass for those of Obi and Oyedepo. But in a world where artificial intelligence is being deployed in previously unimaginable ways, anything is possible.
A Twitter handle that goes by the name Democracy Watchman said it conducted a forensic audit of the audio released by People’s Gazette. The goal was to establish whether the voices in the audio were authentic and whether they had been manipulated.
“The first step is to analyse the audio itself for any form of manipulation. Is this an authentic audio that has been cut and joined or is this an entirely scripted conversation that never took place?” Democracy Watchman said.
The forensic analysis was conducted using Adobe Premiere Pro, a timeline-based, non-linear video editing application developed by Adobe Inc. Adobe is not new to the deepfake landscape. In 2021, the company built a deepfake tool known as Project Morpheus, a video version of the company’s Neural Filters, which were introduced in Photoshop in 2020. The software uses machine learning to adjust a subject’s appearance, tweaking things like their age, hair colour, and facial expression (to change a look of surprise into one of anger, for example).
According to Democracy Watchman, a quick “Scene Edit Detection” in Adobe Premiere Pro shows that the audio was manipulated.
“Premiere Pro was able to detect 4 different audios that have been put together to form the 4 minutes, 17 seconds clip,” the Twitter handle said.
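To give a sense of what splice detection can involve, here is a minimal Python sketch. It is not the method Democracy Watchman used, nor a description of how Premiere Pro works internally. It assumes only that clips recorded separately tend to carry different background levels, so an abrupt jump in loudness between adjacent windows may mark a join. The function names, window size, and threshold are illustrative choices, and the input is a synthetic signal, not the leaked audio.

```python
import math
import random


def rms_per_window(samples, window):
    """Return the RMS level of each non-overlapping window of samples."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        levels.append(math.sqrt(sum(x * x for x in chunk) / window))
    return levels


def find_level_jumps(levels, ratio=2.0):
    """Return window indices where the level differs from the previous
    window by more than `ratio` (an assumed, tunable threshold)."""
    jumps = []
    for i in range(1, len(levels)):
        lo, hi = sorted((levels[i - 1], levels[i]))
        if lo > 0 and hi / lo > ratio:
            jumps.append(i)
    return jumps


# Synthetic stand-in for a spliced clip: four stretches of noise,
# each with a different background level.
random.seed(0)
amplitudes = [0.01, 0.05, 0.2, 0.8]
samples = [random.uniform(-a, a) for a in amplitudes for _ in range(4000)]

levels = rms_per_window(samples, window=1000)
jumps = find_level_jumps(levels)
print("level jumps at windows:", jumps)
print("estimated segments:", len(jumps) + 1)
```

On this synthetic input the sketch finds three jumps, which implies four segments. Real recordings are far messier: speech itself varies in loudness, so a heuristic this simple would raise many false alarms and could not, on its own, show that a clip was edited.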
The audit also found that the audio had been compressed multiple times as it was moved from one platform to another, a process that removed all of its metadata.
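Whether a file still carries metadata is something that can be checked directly. The Python sketch below assumes an uncompressed WAV file, since the report does not state the format of the leaked clip. It lists the top-level chunks of the file and reports whether a LIST chunk, where WAV files commonly keep descriptive tags, is present. The function names are hypothetical, and the demo file is generated in memory.

```python
import io
import struct
import wave


def list_riff_chunks(data: bytes):
    """Return the IDs of the top-level chunks in a RIFF/WAVE byte string."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack("<4sI", data[pos:pos + 8])
        chunks.append(chunk_id.decode("ascii", errors="replace"))
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    return chunks


def has_metadata_chunk(data: bytes) -> bool:
    """True if the file has a LIST chunk, which usually holds INFO tags."""
    return "LIST" in list_riff_chunks(data)


def make_demo_wav(with_tags: bool) -> bytes:
    """Build a tiny silent WAV in memory, optionally with a software tag."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * 100)
    data = bytearray(buf.getvalue())
    if with_tags:
        # INFO list with one ISFT ("software") entry of 6 bytes.
        info = b"INFO" + b"ISFT" + struct.pack("<I", 6) + b"Editor"
        data += b"LIST" + struct.pack("<I", len(info)) + info
        struct.pack_into("<I", data, 4, len(data) - 8)  # fix RIFF size
    return bytes(data)


print(list_riff_chunks(make_demo_wav(with_tags=True)))
print(has_metadata_chunk(make_demo_wav(with_tags=False)))
```

A missing metadata chunk does not by itself indicate tampering, since many messaging and social platforms strip such tags when they re-encode uploads.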
A second analysis was conducted on the audio to determine whether it was generated by artificial intelligence and was therefore an outright falsehood. The analyst said he relied on AI, scanning the leaked audio with dessa-oss, an open-source deepfake detection tool hosted on GitHub. When run through dessa-oss, the audio was flagged as AI-generated and fake.
A new video analysis, released last Tuesday, suggests that an audio recording of Atiku Abubakar, the presidential candidate of the Peoples Democratic Party (PDP); his running mate and governor of Delta State, Ifeanyi Okowa; and Aminu Tambuwal, governor of Sokoto, was created with deepfake technology.
The Atiku, Okowa, and Tambuwal audio, which was released prior to the February 25 presidential election, had the three politicians supposedly plotting to rig the election by compromising the governor of the Central Bank of Nigeria and the chairman of the Independent National Electoral Commission. Like Obi, Atiku dismissed the audio, describing it as fake and the work of political detractors.
The use of deepfakes in Nigerian politics could mean the country’s chequered history of political manipulation has entered a new phase.
Globally, deepfakes are considered a problem for the political environment, as their goal is often to manipulate public perception of a particular public figure. Back in 2018, for instance, a Belgian political party released a video of former US President Donald Trump giving a speech calling on Belgium to withdraw from the Paris Climate Agreement. Trump never gave the speech. It was a deepfake.
A special report by Oxford University noted that one of the prominent deployments of deepfakes to shape political discourse came in May 2019, when a manipulated video of Nancy Pelosi, former US House Speaker, was released on the internet, making her appear intoxicated and slurring her words. The video was watched more than 2.5 million times on Facebook in a matter of days and was shared by prominent political leaders. Although the video was later proven to be fake, the Oxford report said it foreshadowed one type of disinformation that could disrupt political discourse and future elections.
Over the years, several deepfakes of prominent politicians in the US have gone viral on social media. In February 2023, PolitiFact, a fact-checking nonprofit operated by the Poynter Institute, had to debunk a deepfake video of Senator Elizabeth Warren. The video altered an interview she gave to MSNBC to make it appear that the senator was saying Republicans should not be allowed to vote.
Experts say the danger is in the fact that it is often difficult for people to unsee what they have seen before.
“For those who do journalism, it means that every time we use a tool (even if it were only our interpretation, our brain), we should remember that we are altering reality by using it,” said Alberto Puliafito, director and media analyst and editor-in-chief of Slow News, an Italian media company. “We should not forget how we inform and accidentally misinform while informing: what we choose to say and what we omit – not out of malice but because it is not part of our view, or maybe it does not enter the viewfinder through which we observe the reality, or it doesn’t fit our story – shape the story we are telling.”