anjitaa Answers: 1

Explain how Word Count is implemented with the Hadoop framework?


What do you understand by the Word Count implementation via the Hadoop framework? Explain in detail.


What I have tried:

I am not able to implement Word Count via the Hadoop framework.

Richard MacCutchan

See the Hadoop documentation; it should be explained there.

1 Answer

Rating:
1

Bansal himani

"The Word Count implementation works as follows:
For example: Input File 1 contains the data: “This is December Month.”
             Input File 2 contains the data: “December is the last month of the year.”

Step 1: The mapper emits a <word, 1> pair for every word in its input file (punctuation is dropped):
Input File 1 output:
<this, 1>
<is, 1>
<December, 1>
<Month, 1>
Input File 2 output:
<December, 1>
<is, 1>
<the, 1>
<last, 1>
<month, 1>
<of, 1>
<the, 1>
<year, 1>
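
The map step above can be sketched in plain Python. This is only an illustration of the idea, not the Hadoop API: the function name `map_words` is hypothetical, and lowercasing every word and stripping punctuation are assumptions made so that "Month." and "month" produce the same key.

```python
import re

def map_words(line):
    """Emit a (word, 1) pair for every word in the line."""
    # Assumption: lowercase the text and keep only runs of letters,
    # so "Month." and "month" map to the same key.
    return [(word, 1) for word in re.findall(r"[a-z]+", line.lower())]

# One mapper call per input file, as in the example above.
file1_pairs = map_words("This is December Month.")
file2_pairs = map_words("December is the last month of the year.")

print(file1_pairs)
print(file2_pairs)
```

Note that "the" appears twice in the second list, each time with a count of 1; the mapper does no summing.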

Step 2: A combiner is run on each input file's mapper output individually, summing the counts of repeated words:
Input File 1 output:
<this, 1>
<is, 1>
<December, 1>
<Month, 1>
Input File 2 output:
<December, 1>
<is, 1>
<the, 2>
<last, 1>
<month, 1>
<of, 1>
<year, 1>
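
A minimal Python sketch of the combiner step, assuming the mapper output is a list of (word, count) pairs with lowercased keys. The name `combine` is hypothetical and this is not Hadoop code; it only shows the local, per-file summing described above.

```python
from collections import Counter

def combine(pairs):
    """Sum the counts of repeated keys within ONE mapper's output."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    # Counter keeps first-seen order, so the pairs come back in input order.
    return list(totals.items())

# Mapper output for Input File 2 (keys lowercased for this sketch).
file2_pairs = [("december", 1), ("is", 1), ("the", 1), ("last", 1),
               ("month", 1), ("of", 1), ("the", 1), ("year", 1)]

file2_combined = combine(file2_pairs)
print(file2_combined)
```

The two ("the", 1) pairs collapse into a single ("the", 2), which reduces the amount of data sent on to the reducer.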

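Step 3: The reducer merges the outputs from both files and sums the counts for each word (case is ignored, so Month and month count as the same word):
<this, 1>
<is, 2>
<December, 2>
<Month, 2>
<the, 2>
<last, 1>
<of, 1>
<year, 1>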
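
The reduce step can be sketched the same way. The helper names `shuffle` and `reduce_counts` are hypothetical, keys are assumed to be lowercased already, and the grouping here only imitates what the framework does between the map and reduce phases.

```python
from collections import defaultdict

def shuffle(*mapper_outputs):
    """Group the counts of each word across all mapper/combiner outputs."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped

def reduce_counts(grouped):
    """Sum the grouped counts per word, visiting keys in sorted order."""
    return {word: sum(counts) for word, counts in sorted(grouped.items())}

# Combiner outputs for the two input files (keys lowercased for this sketch).
file1_combined = [("this", 1), ("is", 1), ("december", 1), ("month", 1)]
file2_combined = [("december", 1), ("is", 1), ("the", 2), ("last", 1),
                  ("month", 1), ("of", 1), ("year", 1)]

totals = reduce_counts(shuffle(file1_combined, file2_combined))
for word, count in totals.items():
    print(word, count)
```

Here "is", "december", and "month" each receive one count from each file and total 2, "the" arrives already combined as 2, and every other word totals 1.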

Final Output:
This 1 time
Is 2 times
December 2 times
Month 2 times
The 2 times
Last 1 time
Of 1 time
Year 1 time

Assignment II - 3rd December
"