
Wikipedia talk:Reliable sources

From Wikipedia, the free encyclopedia

Your opinion is requested

Hi. I need editors with expertise/experience in IRS-related matters in a consensus discussion on the Joan of Arc talk page. Someone added a passage in the section on Joan's cross-dressing, and cited as a source the late Andrea Dworkin, whose Wikipedia article describes her as a "radical feminist" who was criticized for her belief that "all sex is rape", which prompted one critic to label her "a preacher of hate." Dworkin was not a historian, nor trained in history, as her BA was in literature.[1] Could conscientious editors please read what I've presented at the discussion, and then offer their views? Thanks. Nightscream (talk) 03:55, 30 March 2024 (UTC)

Upgrading SCIRS to a guideline

A proposal has been made at Wikipedia:Village pump (policy)#Upgrade SCIRS to a guideline to upgrade Wikipedia:Identifying reliable sources (science) to a guideline. To keep discussion in one place, please leave any comments you have there rather than here. Thryduulf (talk) 15:13, 11 April 2024 (UTC)

Semi-protected edit request on 20 April 2024

162.71.236.123 (talk) 22:44, 20 April 2024 (UTC) In the 2016 League Of Legends Championship, $380,250 USD was split between 3rd and 4th place instead of 3rd place getting all of it.

  •  Not done: You did not list the page you want modified, nor have you provided a reliable source. SWATJester Shoot Blues, Tell VileRat! 23:01, 20 April 2024 (UTC)

Dark Matter

Dark matter is different than Anti-Matter. Big Debstoh777 (talk) 03:15, 21 April 2024 (UTC)

@Debstoh777: You're posting this in the wrong location. This page is for the discussion of the Reliable Sources guideline. SWATJester Shoot Blues, Tell VileRat! 04:08, 21 April 2024 (UTC)
Thank you for that honestly because I'm very new to all this and still learning. Debstoh777 (talk) 06:40, 21 April 2024 (UTC)

AI-written citations?

I was adding an event to an article (Special:Diff/1220193358) when I noticed that the article I was reading as a source, and planning to cite, was tagged as being written by AI on the news company's website. I've looked around a bit, skimmed Wikipedia:Using neural network language models on Wikipedia, WP:LLM, WP:AI, WP:RS and this Wikimedia post, but couldn't find anything directly addressing whether it's ok to cite articles written by AI. The closest I could find is here on WP:RS, tentatively saying "ML generation in itself does not necessarily disqualify a source that is properly checked by the person using it", and here on WP:LLM, which clearly states "LLMs do not follow Wikipedia's policies on verifiability and reliable sourcing.", but in a slightly different context, so I'm getting mixed signals. I also asked Copilot and GPT3.5, which both said AI-written citations are neither explicitly banned nor permitted, with varying levels of vagueness.

For my specific example, I submitted it but put "(AI)" after the name, but I wanted to raise this more broadly because I'm not sure what to do. My proposal is what I did, use them but tag them as AI in the link, but I'm curious to hear other suggestions.

I've put this on the talk pages in Wikipedia:Using neural network language models on Wikipedia and Wikipedia:Reliable sources. SqueakSquawk4 (talk) 11:36, 22 April 2024 (UTC)

For me it comes down to a case by case basis. If AI is being used as part of the process, but ultimately the article is from a real person and editor then it's probably fine. The issue comes from articles completely written by AI with little or no oversight.
The site has an AI disclaimer[2] where they say they only use AI in the first way, not the latter. So on that point I would think it should be ok. -- LCU ActivelyDisinterested «@» °∆t° 13:02, 22 April 2024 (UTC)
@SqueakSquawk4, do you absolutely need that source? If you can find a better one, then I suggest using the better one instead. WhatamIdoing (talk) 02:03, 24 April 2024 (UTC)
A) I kinda do, it's the only citation I found with everything in the same place. If I took it out I'd have to put in 2 or 3 separate citations to not leave something uncited.
B) I was trying to ask more generally, with the one I found as just an example rather than really the focus of what I was asking.
C) @ActivelyDisinterested Thanks, didn't spot that. SqueakSquawk4 (talk) 12:32, 25 April 2024 (UTC)
  • AI = NO Considering the "hallucination" issue that LLMs have, and, in fact, considering how they are constructed at a base logic level, I would categorically treat any "AI" source as intrinsically non-reliable. If a news agency is found to be using "AI" constructed articles on a regular basis then that source should be deprecated. Simonm223 (talk) 12:42, 25 April 2024 (UTC)
    Simon, I think black-and-white rules are easy to understand, but hallucination is only an issue when it appears. AI sometimes generates false claims. If it's writing something you know to be true and non-hallucinated (e.g., because you've read the same claim in other sources, or because it's the kind of general, non-controversial knowledge that Wikipedia:No original research says doesn't require a citation, like "The capital of France is Paris"), then that problem is irrelevant.
    @SqueakSquawk4, editors might accept this source, especially in light of what AD says. However, if the content is important to you, you might consider using the three other sources instead of (or in addition to) this one, to make it harder for someone to remove it on simplistic "all AI is wrong and bad" grounds.
    As a tangent, we've never defined reliable sources. Unlike an article, which would doubtless begin with a sentence like "A reliable source is...", this guideline begins with "Wikipedia articles should be based on reliable, published sources". I suggest that the actual definition, in practice, is "A reliable source is a published source that experienced Wikipedia editors accept as supporting the material it is cited for". Some editors strongly oppose AI-generated sources, and we can usually expect that some editors won't take time to understand the nuances behind using AI as a convenience vs using AI unsupervised to generate content wholesale.[*] Therefore, I'm uncertain whether it would be considered reliable if it were ever seriously disputed.
    [*] This is happening in the real world, with a student accused of plagiarism without any evidence except Turnitin thinking it was AI-generated,[3] so it'll happen on wiki, too. WhatamIdoing (talk) 17:10, 25 April 2024 (UTC)
    I read on some AI-test tool I tried a caveat, something like "don't use this to punish students." Gråbergs Gråa Sång (talk) 18:22, 25 April 2024 (UTC)

This may be a crazy thought, but...

Can we assemble a master list of all sources used throughout all articles in the encyclopedia? With ~7,000,000 articles, some with no references or one reference, but others with hundreds of references, I would guess that there are about 50,000,000 references in Wikipedia. I would further guess that some of those (particularly databases) are heavily used, and could be normalized to a greater degree in some way (e.g., via templates). BD2412 T 17:53, 23 May 2024 (UTC)

Note: Wikipedia:Articles with the most references lists the most referenced ~875 articles/lists, with a total of ~543,000 references for that outlier group. BD2412 T 17:59, 23 May 2024 (UTC)
Curious… what do you mean by “normalized”? Blueboar (talk) 17:59, 23 May 2024 (UTC)
I mean that if the same source is used as a citation in hundreds of articles, the citation should present the same across those hundreds of articles (and could even be reduced to a template, kind of like the IMDb name template often used in external links). BD2412 T 18:16, 23 May 2024 (UTC)
Maybe not exactly the same, as the same source might be cited with a different page number, or a different excerpted quote. Barnards.tar.gz (talk) 21:17, 23 May 2024 (UTC)
One editor's normalisation is another editor's CITEVAR. I could see lots of discontent if references are normalised across articles with different established referencing styles. -- LCU ActivelyDisinterested «@» °∆t° 19:11, 23 May 2024 (UTC)
My foremost interest is in gathering the data, which we do not have at all. It may well be that there are links appearing as bare URLs in some articles (which is never preferable) and nicely formatted in citations in others, or that there are sources where things like the date of the source or the spelling of the author's name are different across different articles, indicating that at least one of them is erroneous. BD2412 T 20:46, 23 May 2024 (UTC)
You could get one of the dumps, parse the XML to extract the <text> element for each article, and then apply further parsing to get:
  • all <ref> elements
  • anything that uses a recognised {{cite...}} template
  • anything that looks like a URL.
I expect there will be some niche citation styles that may not fit into the above, and some false positives like URLs mentioned in the infobox of an article about a website.
It's probably a perl oneliner... if you start sufficiently far to the left. Barnards.tar.gz (talk) 21:33, 23 May 2024 (UTC)
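For anyone who wants to try the approach described above, here is a minimal sketch in Python rather than Perl. It assumes the dump follows the usual MediaWiki export layout (<page> elements containing <title> and <text>); the sample dump, the function names, and the three regular expressions are illustrative assumptions, not an established tool, and the patterns will miss niche citation styles and produce some false positives, as noted above.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Rough patterns (assumptions): paired <ref>...</ref> tags, {{cite xyz ...}}
# templates, and anything that looks like an http(s) URL.
REF_RE = re.compile(r"<ref\b[^>/]*>(.*?)</ref>", re.DOTALL | re.IGNORECASE)
CITE_RE = re.compile(r"\{\{\s*cite\s+(\w+)", re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s\]|}<>\"]+")


def local_name(tag):
    """Drop the XML namespace prefix that export files put on every tag."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_file):
    """Stream (title, wikitext) pairs without loading the whole dump."""
    title = None
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            yield title, elem.text or ""
        elif name == "page":
            elem.clear()  # free memory once a page has been processed


def extract_citations(wikitext):
    """Return the three kinds of citation evidence found in one article."""
    return {
        "refs": [body.strip() for body in REF_RE.findall(wikitext)],
        "cite_templates": [kind.lower() for kind in CITE_RE.findall(wikitext)],
        "urls": URL_RE.findall(wikitext),
    }


def count_urls(xml_file):
    """Tally how often each URL appears across all pages in the dump."""
    counts = Counter()
    for _title, wikitext in iter_pages(xml_file):
        counts.update(extract_citations(wikitext)["urls"])
    return counts


# Tiny made-up dump so the sketch can be run without downloading anything.
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example A</title>
    <revision>
      <text>Fact one.&lt;ref&gt;{{cite web |url=https://example.org/a |title=A}}&lt;/ref&gt;
Fact two.&lt;ref&gt;https://example.org/b&lt;/ref&gt;</text>
    </revision>
  </page>
  <page>
    <title>Example B</title>
    <revision>
      <text>Fact three.&lt;ref&gt;{{Cite book |title=B |url=https://example.org/a}}&lt;/ref&gt;</text>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    dump = io.BytesIO(SAMPLE_DUMP.encode("utf-8"))
    for url, n in count_urls(dump).most_common():
        print(n, url)
```

On a real dump you would pass an opened (and decompressed) file instead of the in-memory sample. Grouping the resulting URLs by domain would be one way to approach the "most heavily used sources" question, and comparing the contents of "refs" for the same URL across articles would surface the inconsistent dates and author spellings mentioned earlier.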
That's Greek to me. BD2412 T 23:15, 25 May 2024 (UTC)
That sounds like something m:WikiCite might have been interested in. WhatamIdoing (talk) 23:41, 25 May 2024 (UTC)
I'll mention WP:JCW here, though it's not quite what the OP asked. Headbomb {t · c · p · b} 00:05, 26 May 2024 (UTC)
I think that @Ocaasi made a list of the most popular domain names. See Wikipedia:Vaccine safety/Reports. WhatamIdoing (talk) 21:33, 28 May 2024 (UTC)
I see. I gather that this list is for sources used in vaccination-related articles. Perhaps the exercise can be scaled up. BD2412 T 16:42, 13 June 2024 (UTC)
  • This idea makes me very nervous… I worry that as soon as we compile a “list of sources” it will turn into “THE list of (approved) sources”. I understand that this isn’t the intent here, but we have seen something similar happen with RSP. That page was first intended as nothing more than a quick reference aid (of sources that are frequently discussed). However, it has evolved into something else - a lot of editors think it is where you go to “vet” sources, and that it is a list of approved (and, more importantly, disapproved) sources.
Data collection is all fine and dandy, but it can be misused in ways not originally intended. Blueboar (talk) 17:55, 13 June 2024 (UTC)
  • Well, whatever the risks, I would say it is better to know what "we" are doing than not (and that is so even if, as it seems, the "risk" is that someone will say it is used all over Wikipedia, so needs no evaluation here). (E.g., it is somewhat curious that any news source is used for vaccinations, but at least we can now look and see how and whatnot.) Alanscottwalker (talk) 18:04, 13 June 2024 (UTC)

Two possible additions

Under Scholarship, the bullet point Reliable scholarship should consider that books on academic subjects are often reviewed in journals covering the appropriate academic discipline. These can often highlight the value of any particular book. Since some academic publishers seem to be less reliable on the quality of their output than they once were, this is a useful verification of the content of a book (versus a properly peer-reviewed paper).

In rare instances, a review may be so damning that we would probably all see the book in question as not being a suitable source. (See [4] for an example of such a review). Other reviews actually turn out to add to the content of an academic book by giving a second supporting opinion on some content. (See [5] for an example – search for "observations that may not be widely-understood and accepted, but are nevertheless accurate" to see this in action. This example also shows how a review might highlight the strengths and weaknesses of a work, so further helping the editor in how to use a source.)

Therefore I suggest the "Reliable scholarship" paragraph should have added:

  • Books are often reviewed in academic journals that cover their subject – these reviews may help an editor understand the strengths and weaknesses of the work in question.

The second suggestion is more concise. The last sentence of Citation counts should be expanded to say

  • The number of citations may be misleading if an author cites themselves often, or if a work is frequently cited by those who disagree with or disprove it.

Generally, to disagree with the work of others, you have to cite them. This obviously increases the citation count, especially if a lot of other authors publish in disagreement. ThoughtIdRetired TIR 15:14, 8 June 2024 (UTC)

Audit Bureau of Circulations, UK

In a nutshell, the Audit Bureau of Circulations is some sort of gold standard for auditing media in the United Kingdom, so I do not feel like I need to discuss its reliability at WP:RSN. However, I thought that, in this page and possibly elsewhere, there would be a complete database of reports and certificates for historical publications. I know the Informationsgemeinschaft zur Feststellung der Verbreitung von Werbeträgern has a website that keeps records of such with pages like this for example. With the British Audit Bureau of Circulations, I am not sure where to look. That is a shame, because it makes finding a newspaper's or magazine's dominance in the British media market difficult. The numbers are perhaps not the most important aspect of the publications by a long shot, but occasionally, they do get noted. Does ABC in fact keep such records, and if so, where does one look, or does one have to be a registered member to view them? FreeMediaKid$ 14:51, 13 June 2024 (UTC)