I have a dataset of rental prices for different properties. It looks like this:
data = {
    'prices': [
        '$350.00',
        '$450.00 pw',
        '$325 per week',
        '$495pw - Views! White goods!',
        '$460p/w + gst and outgoings',
        '$300 wk',
        '$390pw / $1695pcm',
        '$180 pw / $782 pm',
        '$375 Per Week/Fully Furnished',
        '$350 pw + GST & Outgoings',
        'APPLY NOW - From $270 per week',
        '$185 per night',
        '$400pw incl. power',
        '$500 weekly',
        '$600 per week pw',
        '$850 per week (Fully furnished)',
        'FROM $400PW, FURNITURE AND BILLS INCLUDED',
        'THE DEAL- $780 PER WEEK',
        'THE DEAL: $1,400 PER WEEK',
        '$750/W Unfurnished',
        '$320 - fully furnished pw',
        '$330 PER WEEK | $1,430 P.C.M',
        'Enquire Now: $690 per week',
        '$460 per week / $1999 per month',
        '$490 per week/Under Application approved',
        '$1550pw - Location! Rare gem!',
        '295 per week',  # Example without a dollar sign
        'unit 2 - $780pw unit 3 - $760pw',  # Example with multiple prices
        '$2500 pw high, $1600pw low,$380 pn',  # Example with multiple prices
        'from $786 - $1572 per week',  # Example with multiple prices
        '$590 to $639',  # Example with a range
        '$280 - $290 pw'  # Example with a range
    ]
}
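My goal is to clean the 'prices' column so that it shows only the weekly rent.
I have not managed to handle the last five kinds of entries. Here is what I have so far: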
import re

import pandas as pd

df = pd.DataFrame(data)

def extract_weekly_price(text):
    # take the first number in the string, with or without a dollar sign
    price_match = re.search(r'\$?([\d,]+)', text)
    if price_match:
        price_str = price_match.group(1)
        price = int(price_str.replace(',', ''))
        # convert to a weekly figure if the price is quoted for another period
        if re.search(r'(per week|p\.w\.|p/w|pw|/w|weekly)', text):
            return price
        elif 'p.a' in text:
            return price / 52
        elif re.search(r'(p\.c\.m|pcm|mth|pm)', text):
            return price / 4.33
        elif 'per night' in text:
            return price * 7
        else:
            return price
    else:
        return None

df['prices'] = df['prices'].str.lower()
df['Weekly_rent'] = df['prices'].apply(extract_weekly_price).round(3)
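How can I modify my code so that, for entries containing a range such as '$590 to $639' or '$280 - $290 pw', I get the average weekly price? Any help would be greatly appreciated.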
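To make the range case concrete, here is a minimal sketch of the behaviour I am after, not a full solution. The names `RANGE_RE`, `to_number` and `average_price_range` are hypothetical and not part of my code above. The sketch assumes a range is always written as two dollar amounts joined by "-" or "to", with a "$" on both sides, so that something like 'unit 2 - $780pw' is not mistaken for a range.

```python
import re

# Assumed range format: "$low - $high" or "$low to $high".
# Requiring "$" on both amounts is an assumption about this dataset.
RANGE_RE = re.compile(
    r'\$\s*([\d,]+(?:\.\d+)?)\s*(?:-|to)\s*\$\s*([\d,]+(?:\.\d+)?)'
)

def to_number(s):
    """Convert a string such as '1,400' or '450.00' to a float."""
    return float(s.replace(',', ''))

def average_price_range(text):
    """Return the midpoint of a dollar range in text, or None if there is no range."""
    match = RANGE_RE.search(text.lower())
    if match is None:
        return None
    low, high = (to_number(g) for g in match.groups())
    return (low + high) / 2

for sample in ['$590 to $639', '$280 - $290 pw', 'from $786 - $1572 per week', '$450.00 pw']:
    print(sample, '->', average_price_range(sample))
```

The averaged value would presumably still need the same weekly/monthly/nightly conversion that `extract_weekly_price` applies to a single price, and the entries with several unrelated prices are not covered by this sketch.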