RésuméDigéréRésultatsEmpiriques


This page started on DigestedSummaryEmpiricalResults.

There is an objective, empirical way to demonstrate the current state of the DigestedSummary experiment. We can go through all the digested summaries written over the last few months and analyze them along a few key dimensions. The most important to me is how often summaries are digested. Another key dimension is how often summaries are chained (i.e. simply appended; often with semicolons; like this). Also, how often EditCategories are used, and which ones are used most. Finally, we should break this down by author to see whether there are biases based on expertise.

We should also be open to other social behaviours that have surely emerged. I don't believe this list of questions is exhaustive.

We should also compare the rate at which editors modify the digested summary with how often editors wrote an edit summary before DigestedSummary was introduced. A lower modification rate would suggest that the new summary format is too confusing. (Likewise, we can break this down by author.)

I think what we will discover is that by having an open summary format, we will learn how people really want to structure their change information. We can then decide on the best approach for that. -- SunirShah

I wrote a simple algorithm to decide whether a summary was digested or not: if the characters of the previous digest cannot be found in the same order (though not necessarily adjacent) in the new digest, then the summary has been digested, i.e. it could not have been assembled by simply concatenating edit summaries. I categorized these manually to determine exactly what type they were.
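The test described above amounts to a subsequence check. Here is a minimal sketch of it in Python; the function names are mine and hypothetical, not taken from the script actually used for the analysis, and it assumes digests are compared as plain strings with no normalization.

```python
def is_subsequence(old: str, new: str) -> bool:
    """True if every character of `old` appears in `new`, in the same order,
    though not necessarily adjacent."""
    remaining = iter(new)
    # `ch in remaining` consumes the iterator up to and including the match,
    # so each later character has to be found after the previous one.
    return all(ch in remaining for ch in old)


def is_digested(previous_digest: str, new_digest: str) -> bool:
    """Flag a change as 'digested' when the previous digest does not survive
    as a subsequence of the new one (assumed reading of the rule above)."""
    return not is_subsequence(previous_digest, new_digest)


if __name__ == "__main__":
    print(is_digested("fix typo", "fix typo; add link"))  # appended
    print(is_digested("fix typo", "rewrote the intro"))   # replaced or rewritten
```

Under this rule a total replacement of the summary is flagged just like a careful rewrite, which is why the flagged changes still need manual categorization.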

That said, I analysed 21846 revisions covering 2159 pages, extracted from [1]. I divided the time interval into three sections:

I don't have complete figures for the trial period yet; I'll have them soon. In the meantime, enjoy the pretty statistics :) -- ChrisPurcell


Discussion of the first round of analysis

Chris, wow. This is fantastic! Great job! This is incredibly thorough, more than I even imagined.

I realized that your algorithm for deciding 'digestion' also includes people who totally replace the summary with something new and different. One way to account for that is to take a moderately sized random sample of 'digested' changes and assess (manually) on a percentage basis how many are actually digested vs. totally changed vs. whatever. We might assume that the rate remains constant over the period for expediency, even though that itself is not clear. -- SunirShah
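The sampling idea could be sketched roughly as follows. This is only an illustration: the function names, the label strings, and the fixed seed are assumptions of mine, and the labelling step itself stays manual.

```python
import random
from collections import Counter


def sample_for_review(flagged_changes, sample_size, seed=0):
    """Draw a reproducible random sample of flagged changes for manual review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    size = min(sample_size, len(flagged_changes))
    return rng.sample(list(flagged_changes), size)


def estimate_shares(manual_labels):
    """Turn manual labels (e.g. 'digested', 'replaced') into fractions of the sample."""
    counts = Counter(manual_labels)
    total = len(manual_labels)
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    flagged = [f"revision-{i}" for i in range(200)]  # hypothetical identifiers
    print(sample_for_review(flagged, 5))
    print(estimate_shares(["digested", "replaced", "digested", "other"]))
```

The resulting shares only extend to the whole period if the mix of behaviours stayed constant, which is the same assumption flagged above.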

I've analysed the "digested" summaries from June 27 onwards, categorizing them manually. 125 were spam-related: the overwhelming majority restored the digest after a spam attack, and the rest were digests mangled during an attack. In 6, no real changes were made, only whitespace removal.

The remainder I split between mistaken changes, where people replaced the digest field when they shouldn't have, and deliberate digests, where previous summaries were rewritten or even replaced entirely based on sound editorial planning with TheAudience in mind. Note that this is an entirely subjective opinion, as one man's deliberate choice is another's mistake. I found 81 mistakes and 165 deliberate digests.

Many mistakes were of course made by newbies. However, a significant fraction were deliberately made by oldbies making up for the lack of the old non-digested summary by replacing the digest each time, usually on French CopyEdit/PageTranslation pages. Hence the subjectivity: I feel such use is inappropriate for TheAudience reading RecentChanges.

The most interesting use of digests in my opinion is as a kind of WebLog update flag: one posts only the latest update in the digest, as anyone who comes to read it will find any backlog they may have missed.

Note that some digested changes may be misclassified as trivial, as the simple algorithm I used will not flag a later editor adding an EditCategory to a previous editor's post unless they also change the digest text. Nevertheless, such a change could not be assembled by concatenating edit summaries, so it may be considered digestion. Not sure if I'll take the time to put numbers to this hypothesis. -- ChrisPurcell

So, to summarize the conclusion here, in the most recent period, roughly 3% of the time people digested the changes and 25% of the time people simply appended their changes. That indicates a significant drop from before DigestedSummary was introduced when we had change summaries for more than 50% of edits. However, we now have a much better picture of the kinds of things that people want to do with summaries. How can we adapt the workflow to achieve a digestion rate of 10%, which I think would result in a self-sustaining social norm and community practice? -- SunirShah

I think the simplest way would be for change summaries to be focused on the particular change, but then aggregate these changes as a bulleted list rendered on RecentChanges. This bulleted list would really be a wikitext buffer that can then be digested through another interface, accessible through RecentChanges itself. -- SunirShah
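As a rough illustration of the rendering half of this idea, a chained summary could be split into a bulleted wikitext buffer like this. The function name is hypothetical, and the sketch assumes that semicolons only ever separate summaries and that `*` starts a bullet in the wiki's markup; a summary that itself contains a semicolon would be split wrongly.

```python
def render_chained_summary(chained: str) -> str:
    """Render a semicolon-chained summary as one wikitext bullet per change."""
    parts = (part.strip() for part in chained.split(";"))
    # Skip empty fragments left behind by doubled or trailing semicolons.
    return "\n".join(f"* {part}" for part in parts if part)


if __name__ == "__main__":
    print(render_chained_summary("fix typo; add link; reply to Chris"))
```

Storing each change summary separately would avoid the splitting step altogether; this only shows what the list on RecentChanges might look like.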

You have the statistics wrong, because I presented them a bit impenetrably before. 3% were indeed digestions, but 44% of the time people added their changes.

I'm not sure where 10% comes from, seems a bit arbitrary. I think we should take a look at the distribution of changes between summary field replacement before using a pull-the-rabbit-out-of-the-hat number :) I'll get on it soon. -- ChrisPurcell

Ok, that's a lot better. It is still a statistically significant amount. Of course, I have no idea how to compute p-values, so I'm just guessing a drop from 55% to 44% is significant. The 10% number is out of my ass, but it follows from the wisdom of the TeethToTailRatio. I still wonder: if most of the summarizations are not digests, do we need to digest from the edit window, or from somewhere more convenient, like RecentChanges? After all, digesting a long chain of changes is orthogonal to actually editing the page text. That should increase the digestion rate, I think. And if we are simply chaining the change summaries, we can make it easier to read by putting them in a nice clean list rather than separating them with semicolons. -- SunirShah
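For what it's worth, one standard way to check whether a drop like 55% to 44% is significant is a two-proportion z-test. The sketch below is only that: this discussion gives percentages rather than the number of edits in each period, so the counts in the usage example are hypothetical, and the test assumes edits are independent, which is doubtful when a few authors make most of them.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value), using the pooled estimate of the standard error.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # both samples are all-or-nothing; no evidence either way
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 1000 edits per period, 55% vs 44% with a summary.
    z, p = two_proportion_z_test(550, 1000, 440, 1000)
    print(f"z = {z:.2f}, p = {p:.2g}")
```

With the real edit counts substituted in, a small p-value would support reading the drop as more than noise.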

Perhaps the numbers you want to compare are "16%" and "15%" - the proportion of summary-less modifications before and after digests. Or "29%" and "29%" - the proportion of summary-less follow-on edits. Given that I didn't categorize any of that 55%, it's quite likely that a good proportion of it was spam. (There are no edit categories to help out on this decision, and I'm not going to manually scan 8360 pages!) That would make the figures before and after fairly comparable.

We could allow digestion from RecentChanges as well as the edit page; I don't think the numbers support removing the simple digest system from the edit page. Or we could allow digestion from the history page, as that has more pertinent information to hand, and is the place I often find myself on when I want to do some digesting. -- ChrisPurcell


LangueFrançaise PageTranslation DigestedSummaryEmpiricalResults

Discussion
