Hello everyone,
I hope this message finds you well. I was hoping someone could assist me with an error I encountered when attempting to clean data. The error says that the length of values (1000) does not match the length of index (250). Could anyone please suggest code to correct this error, along with any necessary alterations?
Thank you in advance
That value assigned in line one is an l (el), right?
It looks like its length doesn’t match the length of the column df['clean_input'].
Consider changing that variable’s name. An l looks too similar to a 1 in some fonts.
Well, yes. It is telling you that l contains 1000 values, but you want to put them, one value per row, into the rows of the df['clean_input'] column. This does not make sense, because the DataFrame only has 250 rows. A DataFrame has to keep a rectangular shape; it can’t have extra rows in one of the columns.
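For illustration, here is a minimal sketch that reproduces the same kind of error. The data is made up (250 placeholder rows and 1000 placeholder strings); only the column name clean_input and the name l come from the original post.

```python
import pandas as pd

# Made-up stand-ins for the real data: a 250-row DataFrame
# and a list holding 1000 values.
df = pd.DataFrame({"input": ["some text"] * 250})
l = ["cleaned text"] * 1000

try:
    # One value per row is required, so 1000 values cannot fit into 250 rows.
    df["clean_input"] = l
except ValueError as exc:
    message = str(exc)
    print(message)
```

The printed message should mention both lengths, 1000 and 250, much like the one in the question.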
There is no way that anyone else can “suggest the code to correct this error and necessary alterations”, because there is no way that we can know any of the following things:
- Where did the dataframe come from, and why does it have the number of rows that it does?
- Where did l come from, and why does it have the number of values that it does?
- What is the reason why we want to run the code df['clean_input'] = l? What should happen as a result, and why?
- Should l have as many rows as it does? Should the DataFrame have as few rows as it does? And if both of those make sense, then why should it make sense to do df['clean_input'] = l? For example, do we only want to use certain values from l? If so, what is the rule that tells us which values to use?
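Until those questions are answered, about all that can be offered is a way to inspect the two lengths, plus a sketch of what one possible answer might look like. Everything below is an assumption for illustration: the data is made up, and "keep only the first 250 values" is a hypothetical rule, not something the tutorial is known to intend.

```python
import pandas as pd

# Made-up stand-ins for the real DataFrame and list.
df = pd.DataFrame({"input": ["some text"] * 250})
l = [f"cleaned text {i}" for i in range(1000)]

# Step 1: compare the sizes before assigning anything.
n_rows, n_values = len(df), len(l)
print(n_rows, n_values)

# Step 2 (hypothetical rule): use only the first len(df) values.
# This is only correct if those values really belong to these rows.
df["clean_input"] = l[:len(df)]

# Wrapping the list in pd.Series also avoids the error, because pandas
# aligns a Series on the index and drops labels the DataFrame lacks.
# With a default index that is again the first 250 values, so it hides
# the mismatch rather than explaining it.
df["via_series"] = pd.Series(l)
```

If the two lengths differ because l was built from a different or larger dataset than df, the more likely fix is upstream, where l is created, rather than at the assignment.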
Dear Karl,
Thanks for your message. I realise I gave too little information about the error. I am new to Python, so please bear with me while I explain the issue in more depth. The data frame comes from the IMDB site and is used for a recommendation engine in a project tutorial, which I have followed carefully; however, I do not understand the issue myself. I checked online for a solution, and the advice was that extra columns can be added using the pd.Series function; however, being very new to Python, I am unsure how to use it in the context and structure of this programme. I tried changing the l to 1, but I then received a new error stating “KeyError: “[‘clean_input ‘] not in index”. I am unsure what this means and hope someone can explain further.
Thank you all for your help
Hi Franklinvp,
Thank you for your message. Yes, that value is an l (el), and it has now been changed.
Thanks!
Karl did not say to change the “l” (el) to “1” (one), but that using “l” was a bad choice for a name because the letter looks a lot like the digit in some fonts, e.g. l (el) and 1 (one).
As for the error message “['clean_input '] not in index”, is that exactly what you see? If yes, look closely at it: there’s a space in the quotes after the word “clean_input”.
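To show how a stray space can cause this, here is a small sketch with a made-up three-row DataFrame. It assumes the real column is named clean_input with no trailing space, which has not been confirmed, and the exact wording of the KeyError differs between pandas versions.

```python
import pandas as pd

# Made-up DataFrame; the column name has no trailing space.
df = pd.DataFrame({"clean_input": ["a", "b", "c"]})

raised = False
try:
    df[["clean_input "]]  # trailing space: pandas treats this as a different name
except KeyError as exc:
    raised = True
    print(exc)

# repr() makes stray whitespace in column names visible.
names = [repr(c) for c in df.columns]
print(names)

# Without the space, the lookup succeeds.
selected = df[["clean_input"]]
```

If the space turns out to be in the DataFrame's own column names instead, df.columns = df.columns.str.strip() removes leading and trailing whitespace from all of them.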